Commit e23d159

swolchok authored and pytorchmergebot committed
[PyTorch][caffe2] Add CAFFE2_{DECLARE,DEFINE}_KNOWN_TYPE (pytorch#83707)
It looks like we aren't getting inlining for the defined `_typeMetaData` functions from CAFFE_KNOWN_TYPE and there's some cost associated with that. I added new macros that fix this problem; I will migrate to them in a follow-up after I get buy-in from reviewers.

Differential Revision: [D36883685](https://our.internmc.facebook.com/intern/diff/D36883685/)

**NOTE FOR REVIEWERS**: This PR has internal Facebook specific changes or comments, please review them on [Phabricator](https://our.internmc.facebook.com/intern/diff/D36883685/)!

Pull Request resolved: pytorch#83707
Approved by: https://github.com/ezyang
1 parent af741e8 commit e23d159

4 files changed: +144 -44 lines changed
c10/util/typeid.cpp

Lines changed: 30 additions & 34 deletions
@@ -1,6 +1,7 @@
 #include <c10/util/Exception.h>
 #include <c10/util/typeid.h>
 
+#include <algorithm>
 #include <atomic>
 
 #if !defined(_MSC_VER)
@@ -27,7 +28,8 @@ C10_EXPORT void _ThrowRuntimeTypeLogicError(const string& msg) {
 }
 
 // see TypeMeta::addTypeMetaData
-std::atomic<uint16_t> TypeMeta::nextTypeIndex(NumScalarTypes);
+std::mutex TypeMeta::typeMetaDatasLock;
+uint16_t TypeMeta::nextTypeIndex(NumScalarTypes);
 
 // fixed length array of TypeMetaData instances
 detail::TypeMetaData* TypeMeta::typeMetaDatas() {
@@ -53,41 +55,35 @@ detail::TypeMetaData* TypeMeta::typeMetaDatas() {
   return instances;
 }
 
-CAFFE_KNOWN_TYPE(std::string)
-CAFFE_KNOWN_TYPE(uint16_t)
-CAFFE_KNOWN_TYPE(char)
-CAFFE_KNOWN_TYPE(std::unique_ptr<std::mutex>)
-CAFFE_KNOWN_TYPE(std::unique_ptr<std::atomic<bool>>)
-CAFFE_KNOWN_TYPE(std::vector<int32_t>)
-CAFFE_KNOWN_TYPE(std::vector<int64_t>)
-CAFFE_KNOWN_TYPE(std::vector<unsigned long>)
-CAFFE_KNOWN_TYPE(bool*)
-CAFFE_KNOWN_TYPE(char*)
-CAFFE_KNOWN_TYPE(int*)
+uint16_t TypeMeta::existingMetaDataIndexForType(TypeIdentifier identifier) {
+  auto* metaDatas = typeMetaDatas();
+  const auto end = metaDatas + nextTypeIndex;
+  // MaxTypeIndex is not very large; linear search should be fine.
+  auto it = std::find_if(metaDatas, end, [identifier](const auto& metaData) {
+    return metaData.id_ == identifier;
+  });
+  if (it == end) {
+    return MaxTypeIndex;
+  }
+  return static_cast<uint16_t>(it - metaDatas);
+}
 
-// For some of the compilers, long is defined separately from int32_t and
-// int64_t. As a result we will need to actually define them separately.
-// It is recommended that one does NOT use long - use int32_t and int64_t
-// explicitly. Explicit long type annotation may go away in the future.
-// details: This hack works by defining a _guard_long_unique type, which is
-// long iff the compiler has a separate long type and is a dummy type otherwise.
-// we then allocate a type id to that _guard_long_unique. If the compiler has a
-// separate long type, this allocates a type id for long. Otherwise, it
-// allocates a type id for the dummy type, which doesn't matter.
-namespace detail {
-template <class T>
-class _guard_long_unique_dummy final {};
-template <class T>
-using _guard_long_unique = std::conditional_t<
-    std::is_same<long, int32_t>::value || std::is_same<long, int64_t>::value,
-    _guard_long_unique_dummy<T>,
-    T>;
-} // namespace detail
+CAFFE_DEFINE_KNOWN_TYPE(std::string)
+CAFFE_DEFINE_KNOWN_TYPE(uint16_t)
+CAFFE_DEFINE_KNOWN_TYPE(char)
+CAFFE_DEFINE_KNOWN_TYPE(std::unique_ptr<std::mutex>)
+CAFFE_DEFINE_KNOWN_TYPE(std::unique_ptr<std::atomic<bool>>)
+CAFFE_DEFINE_KNOWN_TYPE(std::vector<int32_t>)
+CAFFE_DEFINE_KNOWN_TYPE(std::vector<int64_t>)
+CAFFE_DEFINE_KNOWN_TYPE(std::vector<unsigned long>)
+CAFFE_DEFINE_KNOWN_TYPE(bool*)
+CAFFE_DEFINE_KNOWN_TYPE(char*)
+CAFFE_DEFINE_KNOWN_TYPE(int*)
 
-CAFFE_KNOWN_TYPE(detail::_guard_long_unique<long>);
-CAFFE_KNOWN_TYPE(detail::_guard_long_unique<std::vector<long>>)
+CAFFE_DEFINE_KNOWN_TYPE(detail::_guard_long_unique<long>);
+CAFFE_DEFINE_KNOWN_TYPE(detail::_guard_long_unique<std::vector<long>>)
 
-CAFFE_KNOWN_TYPE(float*)
-CAFFE_KNOWN_TYPE(at::Half*)
+CAFFE_DEFINE_KNOWN_TYPE(float*)
+CAFFE_DEFINE_KNOWN_TYPE(at::Half*)
 
 } // namespace caffe2
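
For readers skimming the typeid.cpp change: the new `existingMetaDataIndexForType` is a linear scan over the slots allocated so far, returning a sentinel when nothing matches. A minimal standalone sketch of that lookup is below. `MetaData`, `kNotFound`, and `findIndex` are hypothetical stand-ins for illustration, not names from c10.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for detail::TypeMetaData; only the id matters here.
struct MetaData {
  std::uint64_t id;
};

// Hypothetical sentinel playing the role MaxTypeIndex plays in the diff.
constexpr std::uint16_t kNotFound = 255;

// Scan the first `count` entries of `table` for `id`.
// Returns the slot index on a match, or kNotFound otherwise.
inline std::uint16_t findIndex(
    const MetaData* table,
    std::uint16_t count,
    std::uint64_t id) {
  assert(count <= kNotFound);
  const MetaData* end = table + count;
  const MetaData* it = std::find_if(
      table, end, [id](const MetaData& m) { return m.id == id; });
  if (it == end) {
    return kNotFound;
  }
  return static_cast<std::uint16_t>(it - table);
}
```

The table is small and bounded, so a linear scan is likely cheap enough. That is the same trade-off the comment in the diff makes.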

c10/util/typeid.h

Lines changed: 112 additions & 9 deletions
@@ -31,16 +31,19 @@
 
 /*
  * TypeIdentifier is a small type containing an id.
- * Types must be registered using CAFFE_KNOWN_TYPE() for them to have a type id.
- * If a type is registered, you can also create an object containing meta data
- * like constructor, destructor, stringified name, ... about the type by calling
- * TypeMeta::Make<T>. This returns a TypeMeta() object, which is basically just
- * a pointer to the type information, so it's cheap to pass around.
+ * Types must be registered using CAFFE_DECLARE_KNOWN_TYPE() (in their header)
+ * and CAFFE_DEFINE_KNOWN_TYPE() (in their .cpp file) for them to have a type
+ * id. If a type is registered, you can also create an object containing meta
+ * data like constructor, destructor, stringified name, ... about the type by
+ * calling TypeMeta::Make<T>. This returns a TypeMeta() object, which is
+ * basically just a pointer to the type information, so it's cheap to pass
+ * around.
  */
 
 // TODO: This file is still in the caffe2 namespace, despite living
 // in the ATen directory. This is because the macro
-// CAFFE_KNOWN_TYPE defines a template specialization, which relies
+// CAFFE_KNOWN_TYPE (and CAFFE_DECLARE_KNOWN_TYPE) defines a template
+// specialization, which relies
 // on the namespace of TypeMeta matching the namespace where the macro is
 // called. This requires us to fix all of the call-sites, which I want to do
 // later. So the namespace is not fixed at the moment.
@@ -508,12 +511,40 @@ class C10_API TypeMeta final {
 #define MaxTypeIndex UINT8_MAX
 #endif
 
-  static std::atomic<uint16_t> nextTypeIndex;
+  // Protects type metadata allocation.
+  // NOLINTNEXTLINE(facebook-hte-NonPodStaticDeclaration)
+  static std::mutex typeMetaDatasLock;
+  static uint16_t nextTypeIndex;
 
   static detail::TypeMetaData* typeMetaDatas();
 
+  static uint16_t existingMetaDataIndexForType(TypeIdentifier identifier);
+
+#ifdef __CUDACC__
+  // NOTE [ TypeIdentifier::Get nvcc/clang discrepancy]
+  // nvcc and clang do not produce identical results for
+  // TypeIdentifier::Get, because TypeIdentifier::Get relies on
+  // __PRETTY_FUNCTION__ and they don't agree on the canonical names
+  // of types (e.g., nvcc normalizes to `short unsigned int`, but clang
+  // calls it `unsigned short`). Hide the implementation of this function
+  // from nvcc so that we always use clang (or whatever host C++ compiler)
+  // for TypeIdentifier::Get.
   template <class T>
-  static uint16_t addTypeMetaData() {
+  C10_EXPORT static uint16_t addTypeMetaData();
+#else
+  template <class T>
+  C10_EXPORT static uint16_t addTypeMetaData() {
+    const auto identifier = TypeIdentifier::Get<T>();
+    // Need to hold this for the rest of the function, protecting:
+    // 1) existingMetaDataIndexForType()
+    // 2) nextTypeIndex++
+    // 3) the write into typeMetaDatas()
+    std::lock_guard<std::mutex> lock(typeMetaDatasLock);
+    // It may exist already if added in a different dynamic shared library.
+    const uint16_t existing_index = existingMetaDataIndexForType(identifier);
+    if (existing_index != MaxTypeIndex) {
+      return existing_index;
+    }
     const uint16_t index = nextTypeIndex++;
     TORCH_CHECK(
         index <= MaxTypeIndex,
@@ -526,10 +557,11 @@ class C10_API TypeMeta final {
         detail::_PickCopy<T>(),
         detail::_PickPlacementDelete<T>(),
         detail::_PickDelete<T>(),
-        TypeIdentifier::Get<T>(),
+        identifier,
         c10::util::get_fully_qualified_type_name<T>()};
     return index;
   }
+#endif
 
   // specializations return indexes into typeMetaDataInstances()
   template <class T>
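
The locking in `addTypeMetaData` follows a find-or-add pattern: take the mutex, look for an existing entry, and only then bump the counter and write the slot. The sketch below shows that shape in isolation, assuming a toy `Registry` class. The class, its members, and the exception it throws are all hypothetical; the real code uses `TORCH_CHECK` and static members of `TypeMeta`.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <mutex>
#include <stdexcept>

// Hypothetical miniature registry; not part of c10.
class Registry {
 public:
  static constexpr std::uint16_t kCapacity = 255;

  // Returns the index already assigned to `id`, or allocates the next slot.
  // The lock is held for the lookup, the counter bump, and the table write,
  // so two threads registering the same id cannot both allocate.
  std::uint16_t findOrAdd(std::uint64_t id) {
    std::lock_guard<std::mutex> lock(mutex_);
    for (std::uint16_t i = 0; i < next_; ++i) {
      if (ids_[i] == id) {
        return i;
      }
    }
    if (next_ >= kCapacity) {
      throw std::runtime_error("registry is full");
    }
    ids_[next_] = id;
    return next_++;
  }

 private:
  std::mutex mutex_;
  std::uint16_t next_ = 0;
  std::array<std::uint64_t, kCapacity> ids_{};
};
```

Because the mutex now serializes every access to the counter, a plain `uint16_t` is enough, which appears to be why the diff drops `std::atomic` from `nextTypeIndex`.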
@@ -582,6 +614,9 @@ inline std::ostream& operator<<(
  * Register unique id for a type so it can be used in TypeMeta context, e.g. be
  * used as a type for Blob or for Tensor elements.
  *
+ * CAFFE_KNOWN_TYPE is deprecated; prefer CAFFE_DECLARE_KNOWN_TYPE and
+ * CAFFE_DEFINE_KNOWN_TYPE.
+ *
  * CAFFE_KNOWN_TYPE does explicit instantiation of TypeIdentifier::Get<T>
  * template function and thus needs to be put in a single translation unit (.cpp
  * file) for a given type T. Other translation units that use type T as a type
@@ -603,18 +638,86 @@ inline std::ostream& operator<<(
 #define EXPORT_IF_NOT_GCC
 #endif
 
+// CAFFE_KNOWN_TYPE is deprecated! Use CAFFE_DECLARE_KNOWN_TYPE and
+// CAFFE_DEFINE_KNOWN_TYPE instead.
 #define CAFFE_KNOWN_TYPE(T) \
+  template uint16_t TypeMeta::addTypeMetaData<T>(); \
   template <> \
   EXPORT_IF_NOT_GCC uint16_t TypeMeta::_typeMetaData<T>() noexcept { \
     static const uint16_t index = addTypeMetaData<T>(); \
     return index; \
   }
 
+#define CAFFE_DEFINE_KNOWN_TYPE(T) \
+  template uint16_t TypeMeta::addTypeMetaData<T>();
+
+// Unlike CAFFE_KNOWN_TYPE, CAFFE_DECLARE_KNOWN_TYPE avoids a function
+// call to access _typeMetaData in the common case.
+#ifdef __CUDACC__
+// nvcc needs its own specialization that doesn't use
+// C10_ALWAYS_INLINE so that it doesn't need to see a definition for
+// _addTypeMeta. See NOTE [ TypeIdentifier::Get nvcc/clang discrepancy
+// ].
+#define CAFFE_DECLARE_KNOWN_TYPE(T) \
+  extern template uint16_t TypeMeta::addTypeMetaData<T>(); \
+  template <> \
+  EXPORT_IF_NOT_GCC inline uint16_t TypeMeta::_typeMetaData<T>() noexcept { \
+    static const uint16_t index = addTypeMetaData<T>(); \
+    return index; \
+  }
+#else
+#define CAFFE_DECLARE_KNOWN_TYPE(T) \
+  extern template uint16_t TypeMeta::addTypeMetaData<T>(); \
+  template <> \
+  EXPORT_IF_NOT_GCC C10_ALWAYS_INLINE uint16_t \
+  TypeMeta::_typeMetaData<T>() noexcept { \
+    static const uint16_t index = addTypeMetaData<T>(); \
+    return index; \
+  }
+#endif
+
 #define CAFFE_KNOWN_TYPE_NOEXPORT(T) \
   template <> \
   uint16_t TypeMeta::_typeMetaData<T>() noexcept { \
     static const uint16_t index = addTypeMetaData<T>(); \
     return index; \
   }
 
+CAFFE_DECLARE_KNOWN_TYPE(std::string)
+CAFFE_DECLARE_KNOWN_TYPE(uint16_t)
+CAFFE_DECLARE_KNOWN_TYPE(char)
+CAFFE_DECLARE_KNOWN_TYPE(std::unique_ptr<std::mutex>)
+CAFFE_DECLARE_KNOWN_TYPE(std::unique_ptr<std::atomic<bool>>)
+CAFFE_DECLARE_KNOWN_TYPE(std::vector<int32_t>)
+CAFFE_DECLARE_KNOWN_TYPE(std::vector<int64_t>)
+CAFFE_DECLARE_KNOWN_TYPE(std::vector<unsigned long>)
+CAFFE_DECLARE_KNOWN_TYPE(bool*)
+CAFFE_DECLARE_KNOWN_TYPE(char*)
+CAFFE_DECLARE_KNOWN_TYPE(int*)
+
+// For some of the compilers, long is defined separately from int32_t and
+// int64_t. As a result we will need to actually define them separately.
+// It is recommended that one does NOT use long - use int32_t and int64_t
+// explicitly. Explicit long type annotation may go away in the future.
+// details: This hack works by defining a _guard_long_unique type, which is
+// long iff the compiler has a separate long type and is a dummy type otherwise.
+// we then allocate a type id to that _guard_long_unique. If the compiler has a
+// separate long type, this allocates a type id for long. Otherwise, it
+// allocates a type id for the dummy type, which doesn't matter.
+namespace detail {
+template <class T>
+class _guard_long_unique_dummy final {};
+template <class T>
+using _guard_long_unique = std::conditional_t<
+    std::is_same<long, int32_t>::value || std::is_same<long, int64_t>::value,
+    _guard_long_unique_dummy<T>,
+    T>;
+} // namespace detail
+
+CAFFE_DECLARE_KNOWN_TYPE(detail::_guard_long_unique<long>);
+CAFFE_DECLARE_KNOWN_TYPE(detail::_guard_long_unique<std::vector<long>>)
+
+CAFFE_DECLARE_KNOWN_TYPE(float*)
+CAFFE_DECLARE_KNOWN_TYPE(at::Half*)
+
 } // namespace caffe2
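
The `_guard_long_unique` alias that moved into the header can be tried on its own. The sketch below reproduces the trick under hypothetical names (`GuardDummy`, `GuardLongUnique`, `kLongIsAlias`): when `long` is the same type as `int32_t` or `int64_t`, the alias resolves to a throwaway dummy type, so registering it cannot collide with the existing registration; otherwise it resolves to the type itself.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical names; the diff uses detail::_guard_long_unique(_dummy).
template <class T>
class GuardDummy final {};

// True when `long` is just another spelling of int32_t or int64_t
// on the current platform.
constexpr bool kLongIsAlias = std::is_same<long, std::int32_t>::value ||
    std::is_same<long, std::int64_t>::value;

// Resolves to a unique dummy type when `long` is an alias, else to T.
template <class T>
using GuardLongUnique = std::conditional_t<kLongIsAlias, GuardDummy<T>, T>;

// Exactly one of the two outcomes holds on any given platform.
static_assert(
    kLongIsAlias
        ? std::is_same<GuardLongUnique<long>, GuardDummy<long>>::value
        : std::is_same<GuardLongUnique<long>, long>::value,
    "guard must pick the dummy iff long is an alias");
```

Which branch is taken depends on the platform's fixed-width typedefs, so the sketch asserts only that the two branches are mutually exclusive.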

caffe2/core/tensor.cc

Lines changed: 1 addition & 1 deletion
@@ -11,7 +11,7 @@
 
 namespace caffe2 {
 
-CAFFE_KNOWN_TYPE(Tensor);
+CAFFE_DEFINE_KNOWN_TYPE(Tensor);
 
 TensorPrinter::TensorPrinter(
     // NOLINTNEXTLINE(modernize-pass-by-value)

caffe2/core/tensor.h

Lines changed: 1 addition & 0 deletions
@@ -662,6 +662,7 @@ void TensorPrinter::Print(const Tensor& tensor) {
   }
 }
 
+CAFFE_DECLARE_KNOWN_TYPE(Tensor)
 } // namespace caffe2
 
 C10_CLANG_DIAGNOSTIC_POP()
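
The Tensor change above shows the intended pairing: DECLARE in the header, DEFINE in the .cpp. Below is a single-file sketch of how that split could work, assuming a toy `MiniMeta` class in place of `TypeMeta`. Every name here (`MiniMeta`, `addType`, `indexOf`, `MINI_DECLARE_KNOWN_TYPE`, `MINI_DEFINE_KNOWN_TYPE`, `Widget`) is hypothetical, and the export and always-inline attributes from the real macros are omitted.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical miniature of the TypeMeta registration machinery.
struct MiniMeta {
  // Slow path: allocates a fresh index for T.
  template <class T>
  static std::uint16_t addType();

  // Fast path: specialized per type by MINI_DECLARE_KNOWN_TYPE.
  template <class T>
  static std::uint16_t indexOf() noexcept;

  // Next free index, kept in a function-local static for simplicity.
  static std::uint16_t& next() {
    static std::uint16_t n = 0;
    return n;
  }
};

template <class T>
std::uint16_t MiniMeta::addType() {
  return next()++;
}

// "Header" half: promise that addType<T> is instantiated elsewhere, and
// give indexOf<T> an inline body that caches the index in a local static.
#define MINI_DECLARE_KNOWN_TYPE(T)                       \
  extern template std::uint16_t MiniMeta::addType<T>();  \
  template <>                                            \
  inline std::uint16_t MiniMeta::indexOf<T>() noexcept { \
    static const std::uint16_t index = addType<T>();     \
    return index;                                        \
  }

// ".cpp" half: the single explicit instantiation of the slow path.
#define MINI_DEFINE_KNOWN_TYPE(T) \
  template std::uint16_t MiniMeta::addType<T>();

struct Widget {};

MINI_DECLARE_KNOWN_TYPE(Widget)
MINI_DECLARE_KNOWN_TYPE(int)

MINI_DEFINE_KNOWN_TYPE(Widget)
MINI_DEFINE_KNOWN_TYPE(int)
```

Since the `indexOf<T>` body is visible wherever the declaration is included, the compiler can inline the cached lookup at the call site, which is the cost the commit message says the old macro was paying. In a real project the two macro calls would sit in different files; they share one file here only to keep the sketch self-contained.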
