Added new kernel 4ic_deinterleave_8i_x2 with generic #399

dkozel · 2020-08-04T21:35:57Z

Here's the generic implementation of a new 4-bit signed integer deinterleave kernel. I'll be adding an SSE2 kernel shortly.

I'm interested in feedback on the code arrangement and implementation style; this is my first VOLK modification.

dkozel · 2020-08-04T23:27:16Z

Clearly needs formatting. There appears to also be an issue with the input or output types which doesn't surprise me. Any feedback/advice welcome. I'll test for correctness when I can, preferably after sorting out the types issue.

jdemel

Thanks for your PR. I put in a lot of comments. I hope that'll help.

kernels/volk/volk_4ic_deinterleave_8i_x2.h

jdemel · 2020-08-05T08:21:32Z

kernels/volk/volk_4ic_deinterleave_8i_x2.h

+ * \b Example
+ * \code
+ * int N = 10000;
+ *
+ * volk_4ic_deinterleave_8i_x2();
+ *
+ * volk_free(x);
+ * \endcode


An example with your use case in mind that describes input and output would be great.

Implemented generic and SSE2 protokernels

dkozel · 2020-08-05T22:33:02Z

There's some const char * error at runtime, and I'm not sure what the cause is. I'll have time to debug on the weekend. I changed the current SSE2 function to unaligned; I'll work on an aligned one with the "instrinsic intrinsic" setup once I have the first two kernels working.

odrisci · 2020-09-17T19:18:06Z

The problem is that the qa subsystem does not handle sub-byte types: it sees the size of the input type as zero bytes and allocates a zero byte array to store it.
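To illustrate how a size of zero could arise, here is a minimal sketch. It assumes the per-item byte count is derived from the bit width by integer division; the helper names are hypothetical and this is not VOLK's actual QA code.

```c
#include <stddef.h>

/* Hypothetical helper, NOT VOLK's QA code: per-item size in bytes,
 * assuming it is computed from the bit width by integer division. */
size_t item_size_bytes(unsigned int bits, int is_complex)
{
    size_t size = bits / 8; /* 4 / 8 == 0 for a 4-bit type */
    if (is_complex) {
        size *= 2; /* doubling zero still gives zero */
    }
    return size;
}

/* Hypothetical helper: total buffer size for num_items items. */
size_t buffer_size_bytes(unsigned int bits, int is_complex, size_t num_items)
{
    return item_size_bytes(bits, is_complex) * num_items;
}
```

Under that assumption, a 4ic buffer of any length comes out as zero bytes, while 8-bit and wider types are sized as expected.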

I think it might be easiest to rename the function to something like volk_8i_deinterleave_nibbles_8i_x2 with signature:

void volk_8i_deinterleave_nibbles_8i_x2(int8_t* lsn, int8_t* msn, const int8_t* in, unsigned int num_samples);

Where lsn and msn are the least and most significant nibbles respectively. I think this interface would be better as it avoids making implicit assumptions about which nibble is I and which is Q.
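A generic protokernel with that signature might look like the sketch below. The function names are hypothetical, and it assumes each nibble is a 4-bit two's-complement value; it works on the unsigned byte value so that no negative number is ever shifted.

```c
#include <stdint.h>

/* Sign-extend a 4-bit two's-complement value (0..15) to -8..7
 * using only well-defined arithmetic. */
int8_t sign_extend_nibble(unsigned int n)
{
    return (int8_t)((int)(n ^ 8u) - 8);
}

/* Hypothetical generic kernel following the proposed signature.
 * Assumes two's-complement nibbles; not part of VOLK. */
void deinterleave_nibbles_generic(int8_t* lsn,
                                  int8_t* msn,
                                  const int8_t* in,
                                  unsigned int num_samples)
{
    for (unsigned int i = 0; i < num_samples; i++) {
        /* Reinterpret the byte as unsigned so mask and shift are fully defined. */
        const unsigned int byte = (uint8_t)in[i];
        lsn[i] = sign_extend_nibble(byte & 0x0Fu);
        msn[i] = sign_extend_nibble(byte >> 4);
    }
}
```

For the input byte 0xF5 (stored as -11) this yields lsn = 5 and msn = -1. Which of the two outputs is I and which is Q is left to the caller.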

jdemel · 2020-09-18T07:51:58Z

@odrisci thanks for pointing this out! Unfortunately, our system to deduce IO is fragile.

jdemel · 2020-09-18T07:55:57Z

kernels/volk/volk_4ic_deinterleave_8i_x2.h

+    for (unsigned int i = 0; i < num_points; i++) {
+        *iBufferPtr++ = (int8_t)(*complexVectorPtr) >> 4;
+        *qBufferPtr++ = ((int8_t)(*complexVectorPtr++) << 4) >> 4;
+    }
+}


I don't know if a bitmask 0x0f or bit shifts should be preferred but intuitively I'd prefer bitmasks because they seem to be more explicit.

The output is a signed type but the current implementation would never return a negative value. Is that intended? I assume no because the SSE impl takes explicit care of the sign bit.

odrisci · 2020-09-18

Unfortunately, left-shifting a negative number is undefined behaviour in C, and right-shifting one is implementation-defined, though most compilers will do what you expect.

The other problem here is that C and C++ only define integer arithmetic on int and wider types, so smaller types are promoted to int first. So technically the result of

(int8_t)(*complexVectorPtr++) << 4

is of type int.
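The effect of the promotion can be checked with a small sketch. The function names are hypothetical, and only non-negative inputs should be passed so that the left shift stays well defined.

```c
#include <stddef.h>
#include <stdint.h>

/* Both shifts happen in int after promotion: the bits moved past bit 7
 * are kept, so the right shift simply restores the original value. */
int shift_without_narrowing(int8_t x)
{
    return (x << 4) >> 4;
}

/* Narrowing to 8 bits between the shifts discards the high nibble.
 * uint8_t is used so the narrowing conversion is well defined. */
int shift_with_narrowing(int8_t x)
{
    return (uint8_t)(x << 4) >> 4;
}

/* Size of the promoted shift expression; it equals sizeof(int). */
size_t shifted_expression_size(void)
{
    int8_t x = 0x35;
    return sizeof(x << 4);
}
```

For x = 0x35 the first function returns 0x35 unchanged, while the second returns 0x05, the low nibble.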

I would rewrite this as either:

for (unsigned int i = 0; i < num_points; i++) {
    *iBufferPtr++ = (*complexVectorPtr >> 4);
    *qBufferPtr++ = (int8_t)((*complexVectorPtr++) << 4) >> 4;
}

if you are happy with the undefined behaviour, or:

for (unsigned int i = 0; i < num_points; i++) {
    *iBufferPtr++ = (*complexVectorPtr) / 16;
    *qBufferPtr++ = (int8_t)((*complexVectorPtr++) * 16) / 16;
}

if you are not. The latter might not optimise as well as the former, but is guaranteed to be correct. Note that I've left the last cast to int8_t as implicit in each case - I'm not sure if this agrees with your coding style.
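One caveat with the division variant is worth checking: since C99, integer division truncates toward zero, whereas an arithmetic right shift rounds toward negative infinity. For a byte such as 0xF5 (stored as -11) the two disagree on the high nibble. The sketch below, with hypothetical function names, compares plain division with a floor division.

```c
#include <stdint.h>

/* High nibble via plain division: truncates toward zero. */
int8_t high_nibble_truncating(int8_t byte)
{
    return (int8_t)(byte / 16);
}

/* High nibble via floor division, which matches an arithmetic right
 * shift by 4 without ever shifting a negative value. */
int8_t high_nibble_floor(int8_t byte)
{
    int q = byte / 16;
    if (byte % 16 < 0) {
        q -= 1; /* round toward negative infinity instead of toward zero */
    }
    return (int8_t)q;
}
```

For -11 the truncating version returns 0, while the floor version returns -1, which is the two's-complement value of the nibble 0xF. The two agree whenever the low nibble is zero or the byte is non-negative.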

ghostop14 · 2020-10-02

Just saw the use case @dkozel was working on, and both nibbles would have a sign bit. I wonder if the following may be the correct way: it would put the sign bits where they need to be, then scale the values down appropriately.

for (unsigned int i = 0; i < num_points; i++) {
    *iBufferPtr++ = (*complexVectorPtr & 0xF0) / 16;
    *qBufferPtr++ = (int8_t)((*complexVectorPtr++) << 4) / 16;
}
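As a check on the bitmask form: after promotion to int, masking with 0xF0 leaves a value between 0 and 240, so dividing by 16 gives 0 to 15 and the sign is lost. The sketch below uses hypothetical function names and assumes two's-complement 4-bit samples; it shows the masked result and one way to map it back to a signed value.

```c
#include <stdint.h>

/* High nibble via mask and divide: the masked value is 0..240,
 * so the result is always 0..15 and never negative. */
int high_nibble_masked(int8_t byte)
{
    return (byte & 0xF0) / 16;
}

/* Map a raw nibble 0..15 to -8..7, assuming two's-complement samples. */
int nibble_to_signed(int nibble)
{
    return nibble >= 8 ? nibble - 16 : nibble;
}
```

For the byte 0xF5 (stored as -11), the masked high nibble is 15, and mapping it through nibble_to_signed gives -1.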

odrisci · 2020-09-18T09:42:55Z

@jdemel No problem. I don't think it's too fragile - just if you want to handle packed bits you will probably need to think about it and plan for that. For example, here we are looking at a datatype 4ic which is two 4-bit integers (a real and an imaginary part) packed into a signed byte. But it is not obvious: a) which is the real and which is the imaginary; b) what is the encoding of the 4 bits.

We have done some work in the GNSS community on a metadata format for GNSS IF data logs. Most GNSS front-ends use only 1 to 4 bits per sample and pack multiple samples from multiple frequency bands into words of 1, 2 or 4 bytes. Essentially we have seen almost every possible combination of bit and byte ordering (big endian bytes in the word with little endian arrangement of samples within a byte for example!)

The metadata standard was recently approved by the Institute of Navigation in the US and can be found here if you want to have a look: https://sdr.ion.org/

Anyway, that's why I recommend using the deinterleave_nibbles name - you get the functionality you need without opening the sub-byte samples can of worms.

dkozel · 2021-06-05T12:17:43Z

I wasn't able to figure out the best path forward here, and the immediate use has been worked around separately. I'm going to close and will open a new PR if I ever loop back around to this. Thanks all for the feedback and comments.

dkozel added the wip (work in progress) and Don't Merge (Please don't merge just yet. Verify what's going on first.) labels Aug 4, 2020

dkozel force-pushed the 4ic_deinterleave_8i_x2 branch from 5a1af82 to 0bbdeb0 on August 4, 2020 23:22

jdemel reviewed Aug 5, 2020

Added new kernel 4ic_deinterleave_8i_x2

5e45b61

Implemented generic and SSE2 protokernels

dkozel force-pushed the 4ic_deinterleave_8i_x2 branch from 0bbdeb0 to 5e45b61 on August 5, 2020 21:49

jdemel reviewed Sep 18, 2020

jdemel linked an issue Oct 21, 2020 that may be closed by this pull request

Add 4ic deinterleave to 8i x2 #398

Open

dkozel closed this Jun 5, 2021
