PS Content Team: These days, you can’t seem to open the news without hearing about someone using GPT-4 in some strange new way. People have started experimenting with using AI to fix code, not just on demand but automatically, re-running the code until it’s fixed. What do you think about this type of “self-fixing” or “regenerative” code? Could it replace the QA process?

Mattias: Well, my immediate thought is that it’s going to work until it blows up! I could imagine someone saying, “I don’t need to get people to debug this,” which is sort of like putting a brick on the accelerator pedal: it only works until your car goes in a ditch.

If you’ve got an AI changing the code until it runs, it’s not fixing the true bugs, just the syntax errors that are crashing the thing. And the main part of software development is digging into the root cause. A syntax error can be valuable in helping you figure out the true error, but you may well have made the mistake somewhere else, not just there. After all, what if the AI decides the best way to fix the problem is to simply comment out the problematic code?
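To make that failure mode concrete, here’s a hypothetical before-and-after in Python. The function and data are invented for illustration; no real tool is being quoted.

```python
# Before: crashes with ZeroDivisionError whenever no orders were loaded.
def average_order_value(orders):
    return sum(o["total"] for o in orders) / len(orders)

# After a naive automated "fix": the crash is gone, but the real bug
# (whatever upstream step produced an empty order list) is now hidden.
def average_order_value(orders):
    if not orders:
        return 0.0  # silences the symptom without touching the root cause
    return sum(o["total"] for o in orders) / len(orders)
```

The second version runs cleanly every time, which is exactly why an iterate-until-it-runs loop would accept it, and exactly why a human debugging the empty order list never gets the chance.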

Lars: I think it sounds great for fixing actual compilation errors, but unless it has context around business use cases, it’s going to be limited. For example, the code might throw an error that stops you refunding someone 1,000 dollars instead of 5 dollars. As a business, you want that error to fire, but the AI that’s fixing the code might not know that.

Mattias: Yeah, that’s a good example. The AI might decide the solution is to disable the error message, and that’ll “fix” the code, but it won’t do the right thing. That’s what I mean by going full bore into a brick wall.

Lars: There are cases in software development where you throw errors deliberately, as a way of catching exactly these kinds of business logic mistakes; it’s best practice. You throw an error like “this number is too big” for the number of students that should be in a class. It serves as a notification for that kind of issue, and if an AI just “fixes” it, that’s a problem. I’m sure the technology will get to a point where it can accommodate that, but we’re not quite there yet.
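Here’s a minimal sketch of the kind of deliberate guard Lars is describing, reusing the refund example from above. The names, the 100-dollar limit, and the gateway stub are all invented for illustration.

```python
MAX_REFUND = 100.00  # hypothetical business rule, not a code defect

def send_to_payment_gateway(order_id: str, amount: float) -> None:
    # Stand-in for a real payment call, so the sketch runs on its own.
    print(f"gateway: {order_id} {amount:+.2f}")

def issue_refund(order_id: str, amount: float) -> None:
    # This raise is the "notification" Lars describes: it exists to stop
    # a 1,000-dollar refund going out when 5 dollars was intended.
    if amount > MAX_REFUND:
        raise ValueError(
            f"refund of {amount:.2f} exceeds the {MAX_REFUND:.2f} limit"
        )
    send_to_payment_gateway(order_id, -amount)
```

An automated fixer that treats the ValueError as the bug and deletes the raise will make the traceback disappear, and it will also silently re-enable exactly the refund the rule was written to prevent.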

PS Content Team: What about if the AI tells you what errors it’s fixed, and gives you the option to approve it? I believe Wolverine, the recently released program that fixes Python programs, tells you what it’s “fixed.” Does the ability to review make a difference, and would you use self-fixing code then?

Jeremy: Yeah, that sounds like something I’d use.

Mattias: If it’s giving me things I can review, that’s something I’d potentially use. That makes me more productive, and is different from things going into production without review. That would be like allowing an intern to work on your code without review, which is dumb. The whole point is to have appropriate checks and balances. 

Jeremy: You shouldn't ever have anything going into production on its own. Humans shouldn’t edit code in production, though they sometimes do! The same is true for AI; it's not special.

Lars: I’d absolutely use it, but like the others, I wouldn’t let it roam free in production. I’d treat it like a code review by a team member. And whatever you review can help inform and teach that model, so it’s valuable for improving its ability to help you.
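In code, the review-gated workflow the panel is describing might look something like the sketch below. This is a generic illustration, not Wolverine’s actual implementation; propose_fix is a placeholder for a call to a model such as GPT-4.

```python
import subprocess
import sys

def run_script(path: str) -> tuple[int, str]:
    """Run the target script in a subprocess and capture its stderr."""
    result = subprocess.run(
        [sys.executable, path], capture_output=True, text=True
    )
    return result.returncode, result.stderr

def self_fix_with_review(path: str, propose_fix, max_attempts: int = 3) -> bool:
    """Run, ask a model for a patch, and gate every change on human approval.

    propose_fix(source, traceback) stands in for a model call that returns
    a candidate replacement for the whole file.
    """
    for _ in range(max_attempts):
        returncode, stderr = run_script(path)
        if returncode == 0:
            return True  # the script ran cleanly; nothing to fix
        with open(path) as f:
            source = f.read()
        candidate = propose_fix(source, stderr)
        print("Proposed fix:\n", candidate)
        if input("Apply this fix? [y/N] ").strip().lower() != "y":
            return False  # the human reviewer rejected the change
        with open(path, "w") as f:
            f.write(candidate)
    return False  # out of attempts; hand it back to a person
```

The input() prompt is the whole point: nothing is written back to disk without a human looking at it first, which is the “checks and balances” Mattias asks for above.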

PS Content Team: What do you think about "autonomous" AI agents that loop over GPT-4 outputs, like AutoGPT and BabyAGI, to iteratively complete complex tasks? Do you think there's business risk or opportunity there? How mature does this technology sound?

Lars: Automation has always been the holy grail of most software development, and autonomous AI agents are another step in that process. As mentioned before, the risk is the lack of context: unless you feed the model or agent enough information to understand nuances and edge cases, the autonomy can result in an output that just isn't what you really wanted. The maturity is just not there yet, in my opinion, but as with anything in AI, that might come quickly.
