Ok, I'm sure this has been beaten to death, but google, etc. found a
lot of descriptions of the problem but none of a portable solution.
I'm working with some firmware drivers which are intended to be as
portable as possible. Data moves thru a switchable 8- or 16-bit data
bus chip (a USB device controller specifically). Performance is
critical so 16-bit is pretty much necessary. Following that example,
let's look at the USB mass storage class. You get commands from the
host in 31 octet command wrappers that look like this (endian issues
aside...):
typedef struct
{
    u32 Signature;
    u32 Tag;
    u32 TransferLength;
    u8 Flags;
    u8 Lun;
    u8 CommandLength;
    u8 Command[15];
} Cbw;
If I have 8 bit data types that's easy enough to get and deal with.
But right now I'm working with a TMS320C55x variant with nothing
smaller than 16-bit data types. So naturally the 8 bit types get all
mixed up when I read them and when I send back similar data every
other octet is garbage. Some responses are filled at runtime, a few
are global constants. I can pack things early, but then I need to
unpack, modify, and repack. Or I can pack before transmission, but
that'd take a bite out of performance. Or I can break things down:
typedef struct
{
    BYTE Signature0;
    BYTE Signature1;
    BYTE Signature2;
    BYTE Signature3;
    BYTE Tag0;
    BYTE Tag1;
    BYTE Tag2;
    BYTE Tag3;
    BYTE TransferLength0;
    BYTE TransferLength1;
    BYTE TransferLength2;
    BYTE TransferLength3;
    BYTE Flags;
    BYTE Lun;
    BYTE CommandLength;
    BYTE Command[15];
} Cbw;
Ugly. I'd really like to avoid that...
Now, I see this problem described countless times (yes, yes,
sizeof(char)==sizeof(int)==1, 16 bit byte is 100% ok by the standard),
but what's the best portable solution to dealing with this? Or at
least *mostly* portable. All the messages I see say "don't store
binary data and don't worry about how many bits are in anything".
Great, but that embedded command field being sent from my host
computer 5 meters away is 15 octets whether I like it or not. I don't
care if everything's stored locally inefficiently so long as
performance is reasonable (and it's clear! Other people *will* be
dealing with this code!)
I'm making progress getting things to work, but it's getting ugly so I
was curious how people deal with this in real life.
Thanks for whatever guidance you can provide,
alex
Octets with non-8 bit bytes...
Started by ●June 10, 2004
Reply by ●June 10, 2004
On 10 Jun 2004 16:59:49 -0700, usenet1@sanks.net (Alex Sanks) wrote in comp.arch.embedded:

> Ok, I'm sure this has been beaten to death, but google, etc. found a
> lot of descriptions of the problem but none of a portable solution.
>
> [snip]
>
> Now, I see this problem described countless times (yes, yes,
> sizeof(char)==sizeof(int)==1, 16 bit byte is 100% ok by the standard),
> but what's the best portable solution to dealing with this? Or at
> least *mostly* portable.
>
> [snip]

I ran across something similar in parsing and formatting CAN packets for the TI 2812 DSP, which likewise has 16-bit chars and ints.

A CAN packet may contain between 0 and 8 octets in the data field of the frame. In our interface, any octet may be part of an 8-bit, 16-bit, or 32-bit value.

I wrote two low-level routines to pack/unpack to an array of eight values, one octet each. When compiled with full optimization they are quite short and fast, at least on the 2812, which has a C-friendly architecture compared to some older DSPs. The result was good enough that I had no need to write them in assembly language. In fact one of my colleagues who wrote the other side of the interface on an ARM used the code unchanged.

You might be able to adapt something from them:

#define OCTET_MASK 0xFFU

static void split_frame(const uint16_t words [4], uint_least8_t *split)
{
    /* can't just walk a pointer to unsigned char through the octets of the */
    /* data frame because unsigned char is 16 bits on the 2812 DSP! */
    split [0] = words[0] & OCTET_MASK;
    split [1] = (words[0] >> 8) & OCTET_MASK;
    split [2] = words[1] & OCTET_MASK;
    split [3] = (words[1] >> 8) & OCTET_MASK;
    split [4] = words[2] & OCTET_MASK;
    split [5] = (words[2] >> 8) & OCTET_MASK;
    split [6] = words[3] & OCTET_MASK;
    split [7] = (words[3] >> 8) & OCTET_MASK;
}

static void assemble_frame(const uint_least8_t *split, uint16_t *words)
{
    /* can't just walk a pointer to unsigned char through the octets of the */
    /* data frame because unsigned char is 16 bits on the 2812 DSP! */
    words [0] = ((uint16_t)split [1] << 8) | split [0];
    words [1] = ((uint16_t)split [3] << 8) | split [2];
    words [2] = ((uint16_t)split [5] << 8) | split [4];
    words [3] = ((uint16_t)split [7] << 8) | split [6];
}

Note that TI doesn't supply a C99 <stdint.h> header with Code Composer Studio for the 2812, so I had to write my own. On mine for the TI, the C99 type uint_least8_t is a typedef for unsigned int. On the ARM compiler, which does supply a <stdint.h>, it is unsigned char.

These things can be done in C in a portable way, it just takes a little thought.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++ http://www.contrib.andrew.cmu.edu/~ajo/docs/FAQ-acllc.html
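The fixed eight-octet routines above could be generalized to an arbitrary octet count, which matters for odd-length packets such as the 31-octet command wrapper from the original post. What follows is only a sketch under assumptions: the names split_octets and assemble_octets are hypothetical, the low octet of each 16-bit word is assumed to come first (as in split_frame), and a trailing odd octet is assumed to sit in the low half of the last word.

```c
#include <stddef.h>
#include <stdint.h>

#define OCTET_MASK 0xFFU

/* Hypothetical helper: unpack n_octets octets from 16-bit words into an
   array holding one octet per element. Assumes low octet first. */
static void split_octets(const uint16_t *words, uint_least8_t *octets,
                         size_t n_octets)
{
    size_t i;
    for (i = 0; i < n_octets; i++) {
        uint16_t w = words[i / 2];
        octets[i] = (uint_least8_t)(((i % 2) ? (w >> 8) : w) & OCTET_MASK);
    }
}

/* Hypothetical helper: pack octets back into 16-bit words. If n_octets is
   odd, the high half of the last word is set to zero. */
static void assemble_octets(const uint_least8_t *octets, uint16_t *words,
                            size_t n_octets)
{
    size_t i;
    for (i = 0; i < n_octets; i += 2) {
        uint16_t lo = (uint16_t)(octets[i] & OCTET_MASK);
        uint16_t hi = (i + 1 < n_octets)
                          ? (uint16_t)(octets[i + 1] & OCTET_MASK)
                          : 0;
        words[i / 2] = (uint16_t)((hi << 8) | lo);
    }
}
```

The masks look redundant on a machine with 8-bit chars, but they keep the result correct where uint_least8_t is wider than 8 bits, which is the case described for the 2812.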
Reply by ●June 11, 2004
Please excuse as I can give no "whatever guidance".
But another question to you:
Can you tell me where to get information about the USB mass storage class ?
Thanks, Wolfgang
Reply by ●June 11, 2004
In comp.arch.embedded Alex Sanks <usenet1@sanks.net> wrote:

> I'm working with some firmware drivers which are intended to be as
> portable as possible. Data moves thru a switchable 8- or 16-bit
> data bus chip (a USB device controller specifically).

What the data bus of that chip is should be pretty much irrelevant. What you need to know is what size the registers are. Or more generally, how that 16-bit layout actually works. The makers of that USB controller *must* be aware of this problem, so check with them for app notes.

> But right now I'm working with a TMS320C55x variant with nothing
> smaller than 16-bit data types. So naturally the 8 bit types get all
> mixed up when I read them and when I send back similar data every
> other octet is garbage.

So don't do that. Marshal your incoming data into something your CPU can use (e.g. one 16-bit word for each octet, let 32-bit words stay 32-bit words, and forget about possible waste), right at the interface between the USB controller and the DSP.

> Ugly. I'd really like to avoid that...

You won't manage to avoid all the ugliness --- you've maneuvered yourself into too ugly a situation for that.

> but what's the best portable solution to dealing with this?

Essentially the same one you use to work with single bits in a C byte: masks and shifts. Or, only if you know your compiler will _never_ change its behaviour in that aspect, bit-fields.

--
Hans-Bernhard Broeker (broeker@physik.rwth-aachen.de)
Even if all the snow were burnt, ashes would remain.
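The marshalling advice above might look roughly like this for the command wrapper. It is a sketch under stated assumptions, not a drop-in driver: octet_at, le32_at, and cbw_unmarshal are hypothetical names, the controller is assumed to deliver the first octet in the low half of each 16-bit word, and multi-octet fields are little-endian as USB specifies. Command is sized at 16 octets here so that the fields add up to the 31 octets stated in the original post (the struct as posted adds up to 30); check the class spec before relying on that.

```c
#include <stdint.h>

#define CBW_WORDS 16U /* 31 octets rounded up to whole 16-bit words */

/* Unpacked form: one octet per element, 32-bit fields kept whole. */
typedef struct {
    uint_least32_t Signature;
    uint_least32_t Tag;
    uint_least32_t TransferLength;
    uint_least8_t  Flags;
    uint_least8_t  Lun;
    uint_least8_t  CommandLength;
    uint_least8_t  Command[16]; /* assumption: 16, see note above */
} Cbw;

/* Fetch octet i from the word buffer (assumes low octet arrived first). */
static uint_least8_t octet_at(const uint16_t *words, unsigned i)
{
    uint16_t w = words[i / 2U];
    return (uint_least8_t)(((i % 2U) ? (w >> 8) : w) & 0xFFU);
}

/* Assemble a little-endian 32-bit value starting at octet offset i. */
static uint_least32_t le32_at(const uint16_t *words, unsigned i)
{
    return  (uint_least32_t)octet_at(words, i)
         | ((uint_least32_t)octet_at(words, i + 1U) << 8)
         | ((uint_least32_t)octet_at(words, i + 2U) << 16)
         | ((uint_least32_t)octet_at(words, i + 3U) << 24);
}

/* Unpack once, right at the controller interface. */
static void cbw_unmarshal(const uint16_t words[CBW_WORDS], Cbw *cbw)
{
    unsigned i;
    cbw->Signature      = le32_at(words, 0U);
    cbw->Tag            = le32_at(words, 4U);
    cbw->TransferLength = le32_at(words, 8U);
    cbw->Flags          = octet_at(words, 12U);
    cbw->Lun            = octet_at(words, 13U);
    cbw->CommandLength  = octet_at(words, 14U);
    for (i = 0U; i < 16U; i++)
        cbw->Command[i] = octet_at(words, 15U + i);
}
```

The rest of the firmware then works on the Cbw struct with ordinary member access and never sees the packed layout. The cost is one pass over 16 words per command, which is small next to the data phase that follows.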
Reply by ●June 11, 2004
On Fri, 11 Jun 2004 08:26:45 +0200, Wolfgang <never@nowhere.com> wrote:

> Can you tell me where to get information about the USB mass storage
> class ?

USB.org has a good collection of documents, including class specs.

http://www.usb.org/developers/devclass/

HTH,
Vadim
Reply by ●June 12, 2004
Guy Macon <http://www.guymacon.com> wrote in message news:<10chvgq33qr2v00@corp.supernews.com>...

> "Octets" and "Bytes" are always 8 bits. The term you want is "Words."

I can't imagine Octet meaning anything but 8 bits. However, over time, Byte has been used in many ways. There were a lot of 7 bit bytes at one time - a chunk big enough to hold an ASCII character in machines not measured in powers of 2.

Word is almost certainly not the right term here. That usually implies the natural length of data elements on the machine - 16 bit words on 16 bit machines, 36 bit words on 36 bit machines, etc.

Gee ain't terminology a pain :-)

Regards,
Steve
Reply by ●June 12, 2004
On 10 Jun 2004 16:59:49 -0700, usenet1@sanks.net (Alex Sanks) wrote:

>If I have 8 bit data types that's easy enough to get and deal with.
>But right now I'm working with a TMS320C55x variant with nothing
>smaller than 16-bit data types. So naturally the 8 bit types get all
>mixed up when I read them and when I send back similar data every
>other octet is garbage.

[ Stuff snipped]

One portable way I have seen to force C to use a specific size is the following:

typedef struct {unsigned data:8;} BYTE;

The only problem is that if you define a variable as

BYTE foo;

you have to use it with the following syntax

foo.data=20;

At least the compiler will have to do the masking etc. if necessary. Most reasonable compilers should generate the same code for a normal unsigned char variable and this type if it happens to be the same width.

Regards
Anton Erasmus
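A small sketch of what the bit-field wrapper above buys you: assigning to an unsigned 8-bit bit-field reduces the value modulo 256, so the compiler inserts whatever masking the target needs. The helper name store_octet is hypothetical and exists only to make the behaviour visible.

```c
/* The wrapper type suggested above. */
typedef struct { unsigned data : 8; } BYTE;

/* Hypothetical helper: store a value through the bit-field and read it
   back, showing that only the low 8 bits survive the assignment. */
static unsigned store_octet(unsigned value)
{
    BYTE b;
    b.data = value; /* converted to an 8-bit unsigned field: value % 256 */
    return b.data;
}
```

This only guarantees the range of the stored value. The size of BYTE and where the 8 bits sit inside it are left to the implementation, so an array of BYTE should not be assumed to match the octet stream on the wire.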
Reply by ●June 13, 2004
On 10 Jun 2004 16:59:49 -0700, usenet1@sanks.net (Alex Sanks) wrote:

>I'm working with some firmware drivers which are intended to be as
>portable as possible. Data moves thru a switchable 8- or 16-bit data
>bus chip (a USB device controller specifically). Performance is
>critical so 16-bit is pretty much necessary. Following that example,
>let's look at the USB mass storage class. You get commands from the
>host in 31 octet command wrappers that look like this (endian issues
>aside...):

I do not see much point in this discussion unless you also consider the endian issues, if you really want the code to be as portable as possible.

If the endianness of the message and your hardware do not match, you sooner or later have to isolate the bytes and rearrange them, so why not do it immediately when assembling or disassembling the data frame?

Postponing the endianness handling to a later stage only makes sense if the hardware contains a nice byte swap instruction (or an 8 bit rotate instruction on a 16 bit register), to be included with an in-line assembly statement, but this is not very portable :-).

Paul
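Handling byte order while assembling and disassembling the frame can be done with shifts alone, which gives the same result whatever the host's endianness. A minimal sketch, assuming the octets have already been separated into one element each; get_le32, get_be32, and put_le32 are hypothetical names.

```c
#include <stdint.h>

/* Read a 32-bit value whose least significant octet comes first. */
static uint_least32_t get_le32(const uint_least8_t *o)
{
    return  (uint_least32_t)(o[0] & 0xFFU)
         | ((uint_least32_t)(o[1] & 0xFFU) << 8)
         | ((uint_least32_t)(o[2] & 0xFFU) << 16)
         | ((uint_least32_t)(o[3] & 0xFFU) << 24);
}

/* Read a 32-bit value whose most significant octet comes first. */
static uint_least32_t get_be32(const uint_least8_t *o)
{
    return ((uint_least32_t)(o[0] & 0xFFU) << 24)
         | ((uint_least32_t)(o[1] & 0xFFU) << 16)
         | ((uint_least32_t)(o[2] & 0xFFU) << 8)
         |  (uint_least32_t)(o[3] & 0xFFU);
}

/* Write a 32-bit value as four octets, least significant first. */
static void put_le32(uint_least8_t *o, uint_least32_t v)
{
    o[0] = (uint_least8_t)(v & 0xFFU);
    o[1] = (uint_least8_t)((v >> 8) & 0xFFU);
    o[2] = (uint_least8_t)((v >> 16) & 0xFFU);
    o[3] = (uint_least8_t)((v >> 24) & 0xFFU);
}
```

Because the code never reinterprets memory through a pointer cast, there is no separate byte swap step to get wrong later; the wire order is decided in exactly one place.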
Reply by ●June 14, 2004
Anton Erasmus wrote:

> One portable way I have seen to force C to use a specific size is the
> following:
>
> typedef struct {unsigned data:8;} BYTE;
>
> The only problem is that if you define a variable as
>
> BYTE foo;
>
> you have to use it with following syntax
>
> foo.data=20;

Unfortunately, that's not the only problem. The C standard does not dictate the order of bits in a bit field. The compiler is at liberty to put the first N bits in the ls-bits or the ms-bits, so this:

typedef struct
{
    unsigned byte_a : 8;
    unsigned byte_b : 8;
} Pair;

can have byte_a in the upper 8 bits or in the lower 8 bits.

If you're exchanging this data between two machines that use two different compilers, you have to make sure they speak the same bit field lingo, or you have to jump through hoops to re-define them based on how they get packed. I think "portable" and "jump through hoops" could be considered mutually exclusive.

--
Jim Thomas
Principal Applications Engineer
Bittware, Inc
jthomas@bittware.com
http://www.bittware.com
(703) 779-7770
A pessimist is an optimist with experience






