Create Aider Agent #120

thiswillbeyourgithub · 2024-03-24T11:09:15Z

Summary
There's an open source AI pair programming tool called aider that implements something that may interest you: a set of Python classes and functions that ask the LLM to output only the diff to apply instead of rewriting the whole file. This both reduces the chance of errors and greatly reduces the number of tokens to write (importantly, completion tokens are much more expensive than prompt tokens).

Motivation
Reduce token cost and errors.

Technical Design
A report showcasing their stuff can be found here. Most of the code is here and the prompts are here.
As you can see, a lot of thought went into this, because the LLM otherwise has trouble with things like line numbers.

Alternatives to Consider
None that I know of.

Additional context
For a personal project I asked about using only aider's functions, and you can read the issue here.
Also, hearing about OpenDevin led me to devika too, so I'll be posting this exact same issue on their repo.

rbren · 2024-03-24T22:15:09Z

This is really interesting--thanks for bringing it up!

The improvement from using diffs is impressive. But I imagine the logic for applying them is...messy. This would be really interesting for an agent to try out

rbren · 2024-03-24T22:15:48Z

I also really like the idea of the user telling the agent which files to focus on!

thiswillbeyourgithub · 2024-03-25T08:07:15Z

Glad you like it!

messy

On the contrary, to me it seems cleaner, especially for long files and projects.

Also, things like symbex for Python seem very promising for getting a bird's eye view of a project by showing only the signature of each function, like a human would. I'm sure other general parsers exist for multiple languages.
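To illustrate the "signatures only" idea, here is a minimal sketch using only Python's standard `ast` module. It is not symbex's API or aider's repo map; the function name `list_signatures` is made up for this example, and it assumes Python 3.9+ for `ast.unparse`.

```python
import ast


def list_signatures(source: str) -> list[str]:
    """Return one line per class/function definition, in source order."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            prefix = "async def" if isinstance(node, ast.AsyncFunctionDef) else "def"
            # ast.unparse renders the argument list back to source text.
            found.append((node.lineno, f"{prefix} {node.name}({ast.unparse(node.args)})"))
        elif isinstance(node, ast.ClassDef):
            found.append((node.lineno, f"class {node.name}"))
    # ast.walk is breadth-first, so sort by line number to restore file order.
    return [text for _, text in sorted(found)]


if __name__ == "__main__":
    example = "class A:\n    def m(self, x=1):\n        return x\n\ndef f(a, b):\n    pass\n"
    for line in list_signatures(example):
        print(line)
```

A summary like this is far smaller than the file itself, which is the point: the LLM gets the shape of the project without the function bodies.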

But the diff thing is a priority in my mind.

cloudbow · 2024-03-28T10:40:04Z

As a user of aider chat, I like it the most out of all the tools available. One big area where aider also adds value is working with an existing codebase. Can we prioritize this as well in the OpenDevin project? I guess you already mentioned this, but aider.chat can be used on an existing repository and understands all the symbols in it. It creates a repository map containing all the symbols, which is extremely good at pinpointing the changes.

0xdevalias · 2024-04-05T01:41:13Z

This blog about how they use tree-sitter to build a graph of the repo/code is also really interesting/useful:

https://aider.chat/docs/repomap.html

neubig · 2024-05-11T15:47:55Z

Our new OpenDevin CodeAct agent implements some of the tools from SWE-Agent that make it possible to do many of the things that aider supports. If there is interest in implementing an aider agent we'd be happy to have contributions, but I'm going to close the issue as unplanned for now unless someone is interested in doing this!

rawwerks · 2024-05-24T22:38:52Z

watching the llamaindex webinar with @rbren now - i think an aider "microagent" would be insanely powerful.

specifically - i think it could help with some of the context window mgmt challenges. paul has put an insane amount of work into refining the diff structure within aider. and if aider is just a tool or a micro-agent, then the parent agent can just see if things work and doesn't necessarily need to be bothered with the details of what the aider tool / microagent did.

0xdevalias · 2024-05-30T02:52:29Z

https://aider.chat/2024/05/22/swe-bench-lite.html
- Aider scored 26.3% on the SWE Bench Lite benchmark, achieving a state-of-the-art result. The current top leaderboard entry is 20.3% from Amazon Q Developer Agent. The best result reported elsewhere seems to be 25% from OpenDevin.
https://www.swebench.com/
https://github.com/paul-gauthier/aider-swe-bench
- Harness used to benchmark aider against SWE Bench benchmarks
- https://github.com/paul-gauthier/aider-swe-bench#the-aider-agent
  - The "aider agent"
    The "aider agent" is dead simple. It simply invokes aider on a fresh copy of the problem's git repo over and over, iterating through the models it's been told to use. Aider is invoked repeatedly until aider reports that it successfully edited the repo without any outstanding edit, lint or test errors. This is a plausible solution, so the agent is done.
    
    Aider is configured with a test command to run all the pre-existing tests in the problem's repo. Aider is also configured to proceed with all its suggested actions without any user approval.
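The retry loop described above could be sketched roughly as follows. This is not the code from the aider-swe-bench harness: `AttemptResult`, `find_plausible_solution`, and the `run_attempt` callable are hypothetical names for this example, and the default cap of 6 attempts is an assumption taken from a later comment in this thread.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class AttemptResult:
    """Outcome of one hypothetical run on a fresh copy of the repo."""
    model: str
    edit_ok: bool
    lint_ok: bool
    tests_ok: bool

    @property
    def plausible(self) -> bool:
        # "Plausible" here means no outstanding edit, lint or test errors.
        return self.edit_ok and self.lint_ok and self.tests_ok


def find_plausible_solution(
    run_attempt: Callable[[str], AttemptResult],
    models: Sequence[str],
    max_attempts: int = 6,  # assumption of this sketch
) -> Optional[AttemptResult]:
    """Cycle through the models until one attempt comes back clean."""
    for attempt in range(max_attempts):
        model = models[attempt % len(models)]
        result = run_attempt(model)  # caller supplies a fresh checkout per call
        if result.plausible:
            return result
    return None  # no plausible solution within the attempt budget
```

Here `run_attempt` stands in for whatever actually launches the tool and collects its edit, lint, and test status; the loop itself only decides when to stop.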

See also:

li-boxuan · 2024-05-30T04:36:04Z

Yeah aider's benchmark score is insanely high! We should definitely incorporate aider (in some form).

neubig · 2024-05-30T10:03:30Z

@rbren @deniz-birlikci and I were talking about the logistics of doing this on slack, here are some details:

Regarding benchmark scores, aider is doing a thing where they repeat over and over again, up to 6 times, if aider doesn't come up with a test-runnable/lintable solution. So the scores are actually a bit lower (~20%) if they only try once. But I think aider definitely has some good ideas incorporated so it's worth trying.

From @rbren:

We might want to pull from Aider in a piecemeal way, rather than importing them as a dependency (we actually can't add it rn due to a conflict in playwright versions anyways--I guess they're working on browsing?)
Some ideas:

- Add RepoMap to the State object, maybe using the Aider class. Then other agents can take advantage of it
- Implement an EditBlockCoder agent, which
  - takes in a task that describes the edits to be made
  - reads the necessary file
  - prompts the LLM in EditBlock format
  - translates the response into bash for SEARCH/REPLACE or similar
- Pull in the linting functionality
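As a rough illustration of the EditBlock idea, here is a minimal sketch that applies SEARCH/REPLACE blocks from an LLM response to a file's contents in memory. It is a simplified stand-in, not aider's actual parser: the regex, the `apply_edit_blocks` name, and the "must match exactly once" rule are assumptions of this example, and it ignores the file name line that precedes each block.

```python
import re

# Matches one block of the form:
#   <<<<<<< SEARCH
#   ...old text...
#   =======
#   ...new text...
#   >>>>>>> REPLACE
BLOCK_RE = re.compile(
    r"<<<<<<< SEARCH\n(?P<search>.*?)=======\n(?P<replace>.*?)>>>>>>> REPLACE",
    re.DOTALL,
)


def apply_edit_blocks(original: str, llm_response: str) -> str:
    """Apply each SEARCH/REPLACE block in order and return the new text."""
    updated = original
    for match in BLOCK_RE.finditer(llm_response):
        search, replace = match.group("search"), match.group("replace")
        # Refuse ambiguous or missing matches instead of guessing.
        if updated.count(search) != 1:
            raise ValueError(f"SEARCH text must match exactly once: {search!r}")
        updated = updated.replace(search, replace, 1)
    return updated


if __name__ == "__main__":
    source = "def add(a, b):\n    return a - b\n"
    response = (
        "math.py\n"
        "<<<<<<< SEARCH\n"
        "    return a - b\n"
        "=======\n"
        "    return a + b\n"
        ">>>>>>> REPLACE\n"
    )
    print(apply_edit_blocks(source, response))
```

Raising on a failed match gives the agent a concrete error to feed back to the LLM, which is where the lint and retry ideas above would plug in.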

PierrunoYT · 2024-06-30T10:59:56Z

We need this after it passes more than 40 % on SWE Bench

0xdevalias · 2024-07-01T02:00:17Z

We need this after it passes more than 40 % on SWE Bench

@PierrunoYT What is significant about 40%?

assertion · 2024-07-01T02:42:38Z

We need this after it passes more than 40 % on SWE Bench

Its name is Aide, seems not the same as Aider? @PierrunoYT

0xdevalias · 2024-07-01T03:02:29Z

Its name is Aide, seems not the same as Aider?

Definitely seems to be different to aider.

Context:

https://github.com/codestoryai/swe_bench_traces
- At CodeStory, we are building Aide, a new age editor made for working along with agents. Unlike AI engineers which throw users out of the loop and chat/copilots which are very much triggered by humans, we envision an editor where agents and developers come together to hack and collaborate.
  
  At the time of this commit, the agentic framework powering Aide scores 40.3% setting a new benchmark on SWE-Bench-Lite
- Where are the rest of the runs, and how do you get your accuracy numbers? codestoryai/swe_bench_traces#1
https://codestory.ai/
- We believe, we now have the opportunity and necessity, to fundamentally re-imagine the editor to be a place where both humans and AI can work together.
  
  Our attempt at this mighty goal, is Aide. We're building an editor that bridges the present and the future — equipped to help developers effectively leverage AI in their workflows today, while paving the way for how we imagine programming with AI will look in the future.
https://aide.dev/
- Aide lets you pick an infra provider and model of choice, add your API key and just start coding. All queries made to the model are available to you in a SQLite DB locally, and our prompts are Open Source.
- https://github.com/codestoryai/prompts
  - Contains the prompts we use to talk to various LLMs for different utilities inside the editor

I couldn't see a PR submission for aide's results here though:

https://github.com/swe-bench/experiments

And since I hadn't seen that MentatBot on the SWE-Bench leaderboard either, here's the blog link + results submission PR for it:

0xdevalias · 2024-07-01T03:20:02Z

Also, things like symbex for Python seem very promising for getting a bird's eye view of a project by showing only the signature of each function, like a human would. I'm sure other general parsers exist for multiple languages.

This blog about how they use tree-sitter to build a graph of the repo/code is also really interesting/useful:

aider.chat/docs/repomap.html

Stack graphs may also help in the 'code search/context' space of things (similar to aider's repo map/etc); it's what powers GitHub's smart code navigation features:

Explore using stack graphs for better code search / navigation / context / repo map / etc #742

neubig · 2024-07-01T03:36:10Z

Here is a link to a twitter thread explaining it: https://x.com/skcd42/status/1806640696662675469

PierrunoYT · 2024-07-01T05:23:11Z

@assertion @0xdevalias Yeah I wrote it wrong and forgot to edit it.

neubig · 2024-07-03T05:13:02Z

I think that actually we can probably close this issue in favor of the more concrete #2185, #2220, #2221

rbren changed the title ~~HeadsUp: Aider is a project implementing robust diff writing for code~~ Create Aider Agent Mar 25, 2024

rbren added the agent framework Strategies for prompting, agent, etc label Mar 25, 2024

rbren added the severity:low Minor issues or affecting single user label Apr 9, 2024

neubig closed this as not planned Won't fix, can't repro, duplicate, stale May 11, 2024

neubig reopened this May 25, 2024

neubig mentioned this issue May 28, 2024

Aider agent #2109

Closed

0xdevalias mentioned this issue May 30, 2024

Explore using stack graphs for better code search / navigation / context / repo map / etc #742

Closed

This was referenced Jun 1, 2024

[Feature]: Aider-inspired RepoMap #2185

Closed

[Feature]: Aider-inspired linting functionality #2220

Closed

[Feature]: Retry on failure functionality #2221

Open

[Feature]: Aider-inspired EditBlock functionality #2222

Closed

mamoodi added the enhancement New feature or request label Jun 19, 2024

neubig closed this as not planned Won't fix, can't repro, duplicate, stale Jul 3, 2024
