Commit d46ce10
Add Positron Assistant Eval (#10883)
This fixes some inspect-ai test issues and adds a new eval for a
hallucination.
- Update model selector from the 1.106 merge
- Fix an issue where the chat message step was too quick and didn't wait for
the response to finish
- Change `sample_1` to be a relevant eval checking for a previously seen
hallucination
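
The "didn't wait for response to finish" fix is not visible in this commit view, so the following is only a minimal sketch of one common way to handle it: poll the response text until it is non-empty and has stopped changing for a short period. Every name here (`wait_for_stable_text`, `read_text`, `stable_for`, and the rest) is hypothetical and not taken from the Positron test code, and the sketch is in Python for brevity, which may not match the language of the actual e2e helpers.

```python
import time
from typing import Callable


def wait_for_stable_text(
    read_text: Callable[[], str],
    *,
    stable_for: float = 1.0,
    timeout: float = 30.0,
    poll_interval: float = 0.1,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Return the text once it is non-empty and unchanged for `stable_for` seconds.

    `read_text` is a hypothetical callback that returns the current contents
    of the chat response. Raises TimeoutError if the text never settles
    within `timeout` seconds.
    """
    deadline = clock() + timeout
    last = read_text()
    last_change = clock()
    while True:
        now = clock()
        # Finished: we have some text and it has not changed recently.
        if last and now - last_change >= stable_for:
            return last
        if now >= deadline:
            raise TimeoutError("response did not finish before the timeout")
        sleep(poll_interval)
        current = read_text()
        if current != last:
            # Still streaming: remember the new text and restart the quiet period.
            last = current
            last_change = clock()
```

In a test, `read_text` would wrap whatever call reads the response element, and the eval would only score the value returned here. If the UI exposes an explicit "response complete" signal, waiting on that signal is usually more robust than a stability heuristic like this one.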
### QA Notes
@:assistant
See eval run at
https://github.com/posit-dev/positron/actions/runs/19872015490
Note: the failures in that run are eval failures, which happen sometimes
and are the main point of this test.
1 parent a511d50 commit d46ce10
File tree
5 files changed, +20 -9 lines changed
- .github/workflows
- test
- assistant-inspect-ai
- e2e
- pages
- tests/inspect-ai