Usually I post complete research papers, but this time I wanted to post something different. Over the last few months I have talked to a lot of people in the field, and I think I have come up with a list of research questions that we as a field need to tackle:
❓ If the AI can produce any code on demand, is it necessary to have modules and dependencies anymore?
Why bother with package management, versioning, supply-chain management, API incompatibilities, and all the other problems of dependency hell if you can just ask the AI to implement exactly the functionality you need directly in your codebase? In other words, do the basic principles of modularity and separation of concerns need to be reconsidered?
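As a toy illustration (my own hypothetical example, not part of the original argument): rather than pulling in a utility dependency for a small helper, the AI could inline exactly the few lines the codebase actually needs.

```python
# Instead of adding a dependency for a tiny utility, the AI inlines
# exactly the helper this codebase needs (hypothetical example).
def left_pad(text: str, width: int, fill: str = " ") -> str:
    """Pad `text` on the left with `fill` until it is `width` characters long."""
    if len(fill) != 1:
        raise ValueError("fill must be a single character")
    return text if len(text) >= width else fill * (width - len(text)) + text

print(left_pad("42", 5, "0"))  # -> "00042"
```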
❓ If the AI context is big enough that entire documentation sets can be sent to the AI with every request, and the AI is able to perfectly recall every interface detail, are abstractions necessary anymore?
Do we still need to build more abstract APIs that hide complex interactions, or can we expose the underlying details and let the AI do the rest? Is the principle of abstraction still relevant?
❓ Can the AI just be given direct access to the data? Do we still need encapsulation?
If the AI can just access the entire dataset at once, especially with agents and tools, do we need to protect the internal state of objects or enforce loose coupling?
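A minimal sketch of that trade-off, using hypothetical names: today we hide state behind methods that enforce invariants; the question is whether an AI agent with tool access could instead be trusted to work on the raw data directly.

```python
from dataclasses import dataclass

# Classic encapsulation: the balance only changes through methods
# that enforce the invariants.
@dataclass
class Account:
    _balance: int = 0

    def deposit(self, amount: int) -> None:
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self._balance += amount

    def balance(self) -> int:
        return self._balance

# The alternative the question hints at: an AI agent reads and writes
# the raw record directly and is trusted (or prompted) to respect the
# invariants itself.
raw_record = {"balance": 0}
raw_record["balance"] += 100  # nothing here prevents an inconsistent update
```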
❓ Is “Keep it simple, stupid” still relevant if only the AI is going to read the code?
Can we ask the AI to build hard-to-read but more robust code? What would happen if, for example, we asked the AI to start every single function with an exhaustive list of validity checks on its inputs? All of us were taught to do this, and we generally agree it would be a good idea, but nobody does it because it takes far too long to write and produces unreadable, unmaintainable code.
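To make that idea concrete, here is a hedged sketch (a hypothetical function, with checks of my own choosing) of what such exhaustively defensive, AI-maintained code could look like:

```python
def schedule_meeting(title: str, duration_minutes: int, attendees: list[str]) -> dict:
    # Exhaustive input validation of the kind nobody writes by hand for every function.
    if not isinstance(title, str):
        raise TypeError(f"title must be str, got {type(title).__name__}")
    if not title.strip():
        raise ValueError("title must not be empty or whitespace")
    if len(title) > 200:
        raise ValueError("title must be at most 200 characters")
    if not isinstance(duration_minutes, int) or isinstance(duration_minutes, bool):
        raise TypeError("duration_minutes must be an int")
    if not 1 <= duration_minutes <= 8 * 60:
        raise ValueError("duration_minutes must be between 1 and 480")
    if not isinstance(attendees, list) or not all(isinstance(a, str) for a in attendees):
        raise TypeError("attendees must be a list of strings")
    if not attendees:
        raise ValueError("attendees must not be empty")
    if len(attendees) != len(set(attendees)):
        raise ValueError("attendees must not contain duplicates")

    # The actual logic is a single line.
    return {"title": title.strip(), "duration": duration_minutes, "attendees": attendees}
```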
❓ What types of tests do we need if the AI can generate entire categories of tests perfectly every time?
Developers write a lot of automated tests that the AI can now simply generate for them. Do we still need to write those ourselves? And if not, are there other tests we need to write instead?
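As one example of a test category that is mechanical enough for the AI to churn out reliably, here is a sketch of parameterized boundary tests in pytest, with a hypothetical slugify helper defined inline:

```python
import pytest


def slugify(text: str) -> str:
    """Hypothetical helper: lowercase, trim, and replace runs of spaces with dashes."""
    return "-".join(text.strip().lower().split())


# Exhaustive boundary/equivalence tests: tedious for a human, trivial for an AI.
@pytest.mark.parametrize(
    "raw, expected",
    [
        ("Hello World", "hello-world"),
        ("  leading and trailing  ", "leading-and-trailing"),
        ("ALREADY-LOWER", "already-lower"),
        ("multiple   spaces", "multiple-spaces"),
        ("", ""),
    ],
)
def test_slugify(raw: str, expected: str) -> None:
    assert slugify(raw) == expected
```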
And then there is the following meta-question:
⁉️ Which parts of the codebase will be shared between the human and the LLM, which will be human-only, and which will be LLM-only?
The shared parts of the codebase are obviously the most interesting ones, because that is where the two sets of limitations interact the most.
You can find a more in-depth analysis of these questions, and how I arrived at them, on my blog: https://lnkd.in/ep4Mx6fS
And if you would like to help me answer some of these questions, my company is hiring. Feel free to contact me here on LinkedIn or at [email protected].
#llm #ai #software #softwareengineering #ArtificialIntelligence #GenerativeAI