names. In most cases these lists of instructions are
not unique to the code they came from. Instead, they
are exceptionally common, both as literal copies and
as subsets of one another. As such, there can be
no sense in which intellectual property rights or the
terms of any license have been violated.
6 CONCLUSIONS
We have described a technique for partitioning the
inductive programming (IP) search space using
instruction subsets. This
enables us to distribute IP work across many
computer cores by assigning each a distinct but
overlapping subset of instructions. Testing suggests
the subsets generalise quickly, particularly when
they are merged. Cross-validation shows they should
work well with unseen code. The approach
significantly reduces the size of the search space.
Any duplication of effort due to subset overlap
quickly becomes insignificant as program size
increases. We also believe that our approach is
ethical and does not exploit open source developers.
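
The claim that overlap-related duplication fades with program size can be illustrated with a rough calculation. The sketch below is not the method used in this work: it assumes, purely for illustration, that a candidate program is a fixed-length sequence of instructions, so a set of k instructions yields k^n candidates of length n. The function names and the integer instruction identifiers are hypothetical. Note that the two subsets together do not cover every program over the full instruction set, which is where the reduction in search space comes from.

```python
def search_space_size(num_instructions: int, program_length: int) -> int:
    """Count fixed-length instruction sequences (a simplifying assumption)."""
    return num_instructions ** program_length


def duplicated_fraction(subset_a: set, subset_b: set, program_length: int) -> float:
    """Share of the combined work on two subsets that is explored twice.

    A candidate is duplicated when it uses only instructions that
    appear in both subsets.
    """
    shared = search_space_size(len(subset_a & subset_b), program_length)
    total = (search_space_size(len(subset_a), program_length)
             + search_space_size(len(subset_b), program_length))
    return shared / total


# Hypothetical instruction identifiers: 15 instructions in total,
# two subsets of 10 that share 5.
subset_a = set(range(0, 10))
subset_b = set(range(5, 15))

for length in (1, 2, 5, 10):
    full = search_space_size(len(subset_a | subset_b), length)
    split = (search_space_size(len(subset_a), length)
             + search_space_size(len(subset_b), length))
    dup = duplicated_fraction(subset_a, subset_b, length)
    print(f"length={length:2d} full={full} split={split} duplicated={dup:.6f}")
```

Under these assumptions the duplicated share falls from one quarter at length 1 to under 2% at length 5, while the combined size of the two subset spaces grows far more slowly than the space over all 15 instructions.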
ACKNOWLEDGEMENTS
This work was supported by Zoea Ltd. Zoea is a
trademark of Zoea Ltd. Other trademarks are the
property of their respective owners.