-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP](segment) Segment footer PB cache #48924
base: master
Are you sure you want to change the base?
Conversation
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
run buildall |
2 similar comments
run buildall |
run buildall |
be/src/olap/segment_footer_cache.h
Outdated
// Do load if not found in cache. | ||
// Return nullptr if load failed. | ||
// Return non-null pointer if load success. | ||
std::shared_ptr<segment_v2::SegmentFooterPB> get_or_load(StoragePageCache::CacheKey cache_key, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
方法名字跟所有的cache 保持一致, insert, lookup
be/src/olap/segment_footer_cache.h
Outdated
std::shared_ptr<segment_v2::SegmentFooterPB> _load_and_update() { | ||
std::shared_ptr<segment_v2::SegmentFooterPB> footer = | ||
std::make_shared<segment_v2::SegmentFooterPB>(); | ||
Status load_status = _load_function(footer); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
不要这么写。 在cloud 上,我们同步的确保只有一个线程去load 可能是不对的,比如他访问s3 不稳定之类的。反而我们多个线程同时加载,谁成功用谁的,可以避免多个query 之间互相干扰。
我感觉没必要,如果file cache miss了,那这里并不并发已经没意义了,而且访问 S3 的并发控制应该是File Cache负责的,我们这里考虑本地 file cache 命中的情况。
be/src/util/doris_metrics.h
Outdated
@@ -230,6 +230,7 @@ class DorisMetrics { | |||
IntCounter* scanner_ctx_cnt = nullptr; | |||
IntCounter* scanner_cnt = nullptr; | |||
IntCounter* scanner_task_cnt = nullptr; | |||
IntCounter* column_reader_cnt = nullptr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
不需要加这个,之前有一个bvar 解决这个问题。
当前,我们已经去不掉bvar了,后面我们想办法把bvar 通过doris metric 这个接口暴露出去就行了。
be/src/olap/segment_footer_cache.h
Outdated
*/ | ||
|
||
class SegmentFooterDataPage : public PageBase<Allocator<false>> { | ||
// Need to implement some virtual methods. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这么写应该是不够的。
比如说我们没有更新size 这些东东,此时会发现page cache 实际用了很多内存,但是从memtracker 上看用的内存又不多。
得看看page base 里的实现
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这么写应该是不够的。 比如说我们没有更新size 这些东东,此时会发现page cache 实际用了很多内存,但是从memtracker 上看用的内存又不多。 得看看page base 里的实现
size 跟 capacity 都更新了的,在 load_and_update 里面
6b06388
to
faf4746
Compare
run buildall |
1 similar comment
run buildall |
6acac95
to
e990845
Compare
be/src/olap/page_cache.h
Outdated
template <typename TAllocator> | ||
class PageBase : private TAllocator, public LRUCacheValueBase { | ||
template <typename TAllocator, typename T> | ||
class MemoryTrackedPageBase : protected TAllocator, public LRUCacheValueBase { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
这个类,为什么需要有Allocator 这个模板参数?
be/src/olap/page_cache.h
Outdated
Slice data() const; | ||
|
||
template <typename T> | ||
T as() const { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
用get 不要用as, as 一般表示吧当前this 转换为某个类型
|
||
footer_pb_page->set_data(footer_pb_shared); | ||
|
||
segment_footer_cache->insert(cache_key, (void*)(footer_pb_page.get()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
我觉得你这个传入 void*, 但是get 出来是shared ptr 我不好说哪个好。
你看如果我们直接能传递一个shared ptr 进去,那么1216 行和1222 行就是一行代码。
而且,从用户角度看,他传入的是一个shared ptr,拿出来的也是一个shared ptr,这样也会比较一致。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
而且你这个写法,得外部感知到MemoryTrackedPageWithPagePtr,实际这个东东外面不需要感知,外部感知的应该只有一个他自己的footer 对象。 这个MemoryTrackedPageWithPagePtr类是我们内部自己需要导致的一个wrapper 对象。
|
||
footer_pb_page->set_data(footer_pb_shared); | ||
|
||
segment_footer_cache->insert(cache_key, (void*)(footer_pb_page.get()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
insert(param1, std::shared_ptr param2)
run buildall |
run buildall |
run buildall |
TPC-H: Total hot run time: 32930 ms
|
TPC-DS: Total hot run time: 191969 ms
|
ClickBench: Total hot run time: 31.56 s
|
NED
What problem does this PR solve?
Issue Number: close #xxx
Related PR: #xxx
Problem Summary:
Release note
None
Check List (For Author)
Test
Behavior changed:
Does this need documentation?
Check List (For Reviewer who merge this PR)