Create Aider Agent #120
Comments
This is really interesting--thanks for bringing it up! The improvement from using diffs is impressive. But I imagine the logic for applying them is...messy. This would be really interesting for an agent to try out.
I also really like the idea of the user telling the agent which files to focus on!
Glad you like it!
To me, on the contrary, it seems cleaner, especially for long files and large projects. Tools like symbex for Python also seem very promising: they give a bird's-eye view of a project by showing only the signature of each function, the way a human would skim it. I'm sure other general parsers exist for multiple languages. But the diff approach is the priority in my mind.
As a user of aider chat, I like it the most out of all the tools available. One big area where aider adds value is working with an existing codebase. Can we prioritize this in the OpenDevin project as well? I guess you already mentioned this, but aider.chat can be used on an existing repository and understands all the symbols in it. It creates a repository map containing all of those symbols, which is extremely good for pinpointing changes.
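To make the "signatures only" idea from the two comments above concrete, here is a minimal sketch using only Python's standard `ast` module. It is not how symbex or aider's repository map actually work; the function name `outline` and the output format are assumptions made for illustration.

```python
import ast


def outline(source: str) -> list[str]:
    """Return one line per class or function, indented by nesting depth.

    Hypothetical helper: only definitions nested directly inside a module,
    class, or function body are listed (defs inside `if` blocks are skipped).
    """
    result: list[str] = []

    def visit(node: ast.AST, depth: int) -> None:
        indent = "    " * depth
        for child in ast.iter_child_nodes(node):
            if isinstance(child, (ast.FunctionDef, ast.AsyncFunctionDef)):
                keyword = "async def" if isinstance(child, ast.AsyncFunctionDef) else "def"
                # ast.unparse (Python 3.9+) renders the argument list as source text.
                result.append(f"{indent}{keyword} {child.name}({ast.unparse(child.args)})")
                visit(child, depth + 1)
            elif isinstance(child, ast.ClassDef):
                result.append(f"{indent}class {child.name}")
                visit(child, depth + 1)

    visit(ast.parse(source), 0)
    return result


sample = "class Repo:\n    def files(self):\n        return []\n\ndef main(path):\n    pass\n"
summary = outline(sample)
```

An agent could put a summary like this in the prompt instead of full file contents, then request whole files only where a change is needed.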
This blog about how they use |
Our new OpenDevin CodeAct agent implements some of the tools from SWE-Agent that make it possible to do many of the things that aider supports. If there is interest in implementing an aider agent we'd be happy to have contributions, but I'm going to close the issue as unplanned for now unless someone is interested in doing this!
Watching the LlamaIndex webinar with @rbren now. I think an aider "microagent" would be insanely powerful. Specifically, I think it could help with some of the context window management challenges. Paul has put an insane amount of work into refining the diff structure within aider. And if aider is just a tool or a microagent, then the parent agent can simply check whether things work and doesn't necessarily need to be bothered with the details of what the aider tool / microagent did.
See also: |
Yeah, aider's benchmark score is insanely high! We should definitely incorporate aider (in some form).
@rbren, @deniz-birlikci, and I were talking about the logistics of doing this on Slack. Here are some details: Regarding benchmark scores, aider retries up to 6 times if it doesn't come up with a solution that runs its tests and lints cleanly. So the scores are actually a bit lower (~20%) if it only tries once. But I think aider definitely incorporates some good ideas, so it's worth trying. From @rbren: We might want to pull from Aider in a piecemeal way, rather than importing it as a dependency (we actually can't add it right now anyway due to a conflict in playwright versions--I guess they're working on browsing?)
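The retry behaviour described in the comment above can be sketched roughly as follows. This is an assumption-laden illustration, not aider's benchmark harness: `propose` and `check` are hypothetical callables standing in for "ask the model for a solution" and "run the tests and linter", and only the limit of 6 attempts comes from the comment.

```python
from typing import Callable, Optional


def solve_with_retries(
    propose: Callable[[int], str],
    check: Callable[[str], bool],
    max_attempts: int = 6,
) -> tuple[Optional[str], int]:
    """Try up to max_attempts candidates; return (accepted candidate, attempts used).

    Returns (None, max_attempts) if no candidate passes the check.
    """
    for attempt in range(1, max_attempts + 1):
        candidate = propose(attempt)  # hypothetical: one LLM call per attempt
        if check(candidate):          # hypothetical: tests pass and lint is clean
            return candidate, attempt
    return None, max_attempts


# Toy stand-ins: the third proposal is the first one that "passes".
result = solve_with_retries(lambda n: f"v{n}", lambda c: c == "v3")
single_try = solve_with_retries(lambda n: f"v{n}", lambda c: c == "v3", max_attempts=1)
```

Comparing the default run with `max_attempts=1` shows why a score measured with retries is not directly comparable to a single-attempt score.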
We need this after it passes more than 40% on SWE Bench
@PierrunoYT What is significant about 40%?
Its name is Aide, seems not the same as Aider? @PierrunoYT
Definitely seems to be different to aider. Context:
I couldn't see a PR submission for And since I hadn't seen that MentatBot on the SWE-Bench leaderboard either, here's the blog link + results submission PR for it: |
Stack graphs may also help in the 'code search/context' space of things (similar to |
Here is a link to a Twitter thread explaining it: https://x.com/skcd42/status/1806640696662675469
@assertion @0xdevalias Yeah, I wrote it wrong and forgot to edit it.
Summary
There's an open source AI pair programming tool called aider that implements something that may interest you: a set of Python classes and functions that ask the LLM to output only the diff to apply instead of rewriting the whole file. This both reduces the chance of errors and greatly reduces the number of tokens to generate (importantly, completion tokens are far more expensive than prompt tokens).
Motivation
Reduce token cost and errors.
Technical Design
A report showcasing their stuff can be found here. Most of the code is here and the prompts are here.
As you can see, a lot of thought went into this, because the LLM otherwise has trouble with line numbers and similar details.
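As a rough illustration of the idea (not aider's actual code or edit format), here is a minimal sketch of applying a model-produced edit expressed as a search/replace pair. Matching on text rather than line numbers is one way to avoid relying on the model to count lines; the function name `apply_search_replace` and the exactly-one-match rule are assumptions made for this example.

```python
def apply_search_replace(source: str, search: str, replace: str) -> str:
    """Apply one search/replace edit to source.

    Hypothetical helper: refuses to edit unless the search text occurs
    exactly once, so a vague or stale edit fails loudly instead of
    silently changing the wrong place.
    """
    count = source.count(search)
    if count != 1:
        raise ValueError(f"expected exactly one match for search text, found {count}")
    return source.replace(search, replace, 1)


original = "def greet(name):\n    print('Hello ' + name)\n"
edited = apply_search_replace(
    original,
    search="    print('Hello ' + name)\n",
    replace="    print(f'Hello, {name}!')\n",
)
```

The model only has to emit the two short snippets rather than the whole file, which is where the completion-token savings come from.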
Alternatives to Consider
None that I know of.
Additional context
For a personal project I asked about using only the functions of aider; you can read the issue here.
Also, hearing about OpenDevin led me to hear about devika too, so I'll be posting this exact same issue on their repo.