Tauno Voipio <tauno.voipio@notused.fi.invalid> wrote:
(snip)

> The whole idea of separate text and binary files has crept into
> pure C as an afterthought because of e.g. CP/M and its descendant
> MS-DOS. The C I/O was originally built for and with Unix. The
> Unix-like systems do not separate binary and text files, but
> most implementations are polite enough to tolerate the 'b'
> specifier.

> The original CP/M file system counted only reservation blocks
> of 128 bytes each. The size comes from the original 2002 block
> 8 inch single-density diskettes, which were the media for the
> original CP/M. (My first Z80 computer ran CP/M 1.3, aeons ago).

Well, it gets more interesting on IBM mainframe systems.
IBM doesn't use a record termination character.  The two popular
formats are FB (Fixed length blocked records, where the record
length is known and constant), and VB (variable length blocked
records with a four-byte block descriptor word (BDW) at the beginning
of each block, and a four-byte record descriptor word (RDW) at the
beginning of each record).  I believe that for text files the C library
adds/removes '\n' (which might be other than X'0A' in EBCDIC).  For
non-text files, I believe by default it doesn't give the BDW and RDW,
but you can get those as an option.

It gets more interesting with fseek() and ftell().  The file
system keeps track of blocks and tracks, but not bytes.  
The system can easily seek to a block and offset within a
block, but not to a byte offset.  For some systems, ftell()
returns 32768*(block number)+(block offset), and fseek()
accepts those values.   Read the C standard description of
fseek() and ftell() for text files.
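
A minimal sketch of what that restriction means in practice, assuming
nothing beyond standard C: on a text stream, treat the value from
ftell() as an opaque cookie and pass only such values (or 0) to
fseek() with SEEK_SET.  The helper names write_demo_text() and
reread_second_line() are made up for this illustration.

```c
#include <stdio.h>

/* Hypothetical helper: create a small three-line text file. */
int write_demo_text(const char *path)
{
    FILE *fp = fopen(path, "w");
    if (!fp)
        return -1;
    fputs("first\nsecond\nthird\n", fp);
    return fclose(fp) == 0 ? 0 : -1;
}

/* Hypothetical helper: read past line 1, remember the position with
   ftell(), read to end of file, then fseek() back and re-read line 2. */
int reread_second_line(const char *path, char *out, int outsz)
{
    char line[128];
    long mark;
    int ok;
    FILE *fp = fopen(path, "r");          /* text stream */

    if (!fp)
        return -1;
    if (!fgets(line, sizeof line, fp)) {  /* consume the first line */
        fclose(fp);
        return -1;
    }
    mark = ftell(fp);                     /* opaque cookie, not a byte count */
    if (mark == -1L) {
        fclose(fp);
        return -1;
    }
    while (fgets(line, sizeof line, fp))  /* run to end of file */
        ;
    /* Only a value previously returned by ftell() is a portable target. */
    ok = fseek(fp, mark, SEEK_SET) == 0
         && fgets(out, outsz, fp) != NULL;
    fclose(fp);
    return ok ? 0 : -1;
}
```

Computing an offset by hand (say, "line length times line number") is
what breaks on systems where the cookie encodes block and offset.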

-- glen

On 27.2.10 7:16 , Stefan Reuther wrote:
> Randy Yates wrote:
>> Now that I've sufficiently beaten myself up, I'm beginning to wonder if
>> I really made a mistake.
>>
>> K&R (2e) doesn't really state in their file I/O section (appendix B) how
>> the non-"b" mode interacts with fread().  And fread() states, without
>> any special cases other than that it may not return the full amount of
>> data requested (the implication being that if there is no more data in
>> the file you won't get what you asked for), that it will read "nobj"s of
>> size "size".  In fact, I can remember thinking in the back of my mind
>> when I coded the fopen() that "b" mode only matters for character-based
>> functions like the scanf functions.
>
> The C standard defines two types of streams, binary and text, and some
> constraints on them. The compiler/library decides how to map those to a
> sequence of chars. It doesn't matter how you access that sequence of
> chars. For example, in C++ there's the common misconception that "only
> std::endl knows how to make a new line". It doesn't. std::endl just
> writes a '\n' character (and does a flush). The underlying library
> converts that into an operating-system-defined newline. The same goes
> for C's printf. printf writes a '\n'. The underlying library converts.
>
>
>    Stefan

The whole idea of separate text and binary files has crept into
pure C as an afterthought because of e.g. CP/M and its descendant
MS-DOS. The C I/O was originally built for and with Unix. The
Unix-like systems do not separate binary and text files, but
most implementations are polite enough to tolerate the 'b'
specifier.

The original CP/M file system counted only reservation blocks
of 128 bytes each. The size comes from the original 2002 block
8 inch single-density diskettes, which were the media for the
original CP/M. (My first Z80 computer ran CP/M 1.3, aeons ago).

-- 

Tauno Voipio

Randy Yates <yates@ieee.org> wrote:
> glen herrmannsfeldt <gah@ugcs.caltech.edu> writes:
(snip)
>> The X'1A' is left over from CP/M, where the directory counts
>> blocks but not bytes.  Text files indicate the end of file with X'1A'.

>> Somehow that tradition was kept with MSDOS, even though the file
>> system does keep the file length in bytes.  There was no need for
>> it, but they did it anyway.  That is a separate question from the
>> use of control-Z for terminating terminal/console input.  Unix
>> uses control-D for that, but doesn't terminate file input 
>> when it finds a control-D.
 
> This is some interesting history, glen. 
 
> Now that I've sufficiently beaten myself up, I'm beginning to 
> wonder if I really made a mistake.
 
> K&R (2e) doesn't really state in their file I/O section (appendix B) how
> the non-"b" mode interacts with fread().  And fread() states, without
> any special cases other than that it may not return the full amount of
> data requested (the implication being that if there is no more data in
> the file you won't get what you asked for), that it will read "nobj"s of
> size "size".  In fact, I can remember thinking in the back of my mind
> when I coded the fopen() that "b" mode only matters for character-based
> functions like the scanf functions.

Yes, it is all system dependent.  It will do CRLF --> '\n' conversion
even for fread() and fwrite(), assuming your file system uses CRLF.
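
A rough way to see this on a given system, sketched in standard C:
write a few raw bytes (including X'0D', X'0A' and X'1A') through a
binary stream, then count what fread() delivers in "rb" mode versus
"r" mode.  The helper names write_bytes() and count_bytes() are
invented for the example.

```c
#include <stdio.h>

/* Hypothetical helper: write n raw bytes through a binary stream. */
int write_bytes(const char *path, const unsigned char *buf, size_t n)
{
    size_t written;
    FILE *fp = fopen(path, "wb");
    if (!fp)
        return -1;
    written = fwrite(buf, 1, n, fp);
    if (fclose(fp) != 0 || written != n)
        return -1;
    return 0;
}

/* Hypothetical helper: count the bytes fread() hands back until it
   returns 0, with the stream opened in the given mode. */
long count_bytes(const char *path, const char *mode)
{
    unsigned char buf[64];
    long total = 0;
    size_t got;
    FILE *fp = fopen(path, mode);
    if (!fp)
        return -1;
    while ((got = fread(buf, 1, sizeof buf, fp)) > 0)
        total += (long)got;
    fclose(fp);
    return total;
}
```

With the seven bytes 01 0D 0A 02 1A 03 04 in the file,
count_bytes(path, "rb") should report 7 everywhere.  What
count_bytes(path, "r") reports is up to the library: a Unix-like one
is expected to give 7 as well, while a DOS-style one may fold the
CRLF pair and may stop at the X'1A'.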

Reading tapes under Unix/unix-like systems with read(), you get
one tape block per read() call.  I forget if that is also true
for fread().  

Does your system keep the length of files in bytes or blocks?

-- glen

Randy Yates wrote:
> glen herrmannsfeldt <gah@ugcs.caltech.edu> writes:
> 
>> Randy Yates <yates@ieee.org> wrote:
>> (snip)
>>
>>>>> Someone on the TI forum determined the problem: I wasn't specifying
>>>>> the "b" option in fopen. fopen(file, "rb")
>> (snip)
>>  
>>> There is a 0x1a near there, yes. 
>>  
>>> I do feel a bit foolish. I've known about the "b" flag for decades -
>>> just got careless. But also, TI's C file I/O has a history of
>>> squirrelishness (sp?), so I was suspecting that type of nonsense.
>>> Additionally, the exact same code ran fine when retargeted for
>>> pc/cygwin. 
>> It is pretty strange, overall.
>>
>> You should expect problems with X'0D' and/or X'0A', as those
>> are the line terminator characters.
>>
>> Cygwin is supposed to work like unix, so better do I/O like unix.
>>
>> The X'1A' is left over from CP/M, where the directory counts
>> blocks but not bytes.  Text files indicate the end of file with X'1A'.
>>
>> Somehow that tradition was kept with MSDOS, even though the file
>> system does keep the file length in bytes.  There was no need for
>> it, but they did it anyway.  That is a separate question from the
>> use of control-Z for terminating terminal/console input.  Unix
>> uses control-D for that, but doesn't terminate file input 
>> when it finds a control-D.
> 
> This is some interesting history, glen. 
> 
> Now that I've sufficiently beaten myself up, I'm beginning to wonder if
> I really made a mistake.
> 
> K&R (2e) doesn't really state in their file I/O section (appendix B) how
> the non-"b" mode interacts with fread().  And fread() states, without
> any special cases other than that it may not return the full amount of
> data requested (the implication being that if there is no more data in
> the file you won't get what you asked for), that it will read "nobj"s of
> size "size".  In fact, I can remember thinking in the back of my mind
> when I coded the fopen() that "b" mode only matters for character-based
> functions like the scanf functions.

Just for the record, those codes have standard names. 0x4 (^D) is EOT; 
end of transmission. 0x3 (^C) is ETX; end of text. MS-DOS stayed with 
tradition by using that to end the session when typing from the keyboard 
to a file, although I would have chosen 0x1C FS; file separator.

Jerry
-- 
Leopold Kronecker on mathematics:
        God created the integers, all else is the work of man.

Randy Yates wrote:
> Now that I've sufficiently beaten myself up, I'm beginning to wonder if
> I really made a mistake.
> 
> K&R (2e) doesn't really state in their file I/O section (appendix B) how
> the non-"b" mode interacts with fread().  And fread() states, without
> any special cases other than that it may not return the full amount of
> data requested (the implication being that if there is no more data in
> the file you won't get what you asked for), that it will read "nobj"s of
> size "size".  In fact, I can remember thinking in the back of my mind
> when I coded the fopen() that "b" mode only matters for character-based
> functions like the scanf functions.

The C standard defines two types of streams, binary and text, and some
constraints on them. The compiler/library decides how to map those to a
sequence of chars. It doesn't matter how you access that sequence of
chars. For example, in C++ there's the common misconception that "only
std::endl knows how to make a new line". It doesn't. std::endl just
writes a '\n' character (and does a flush). The underlying library
converts that into an operating-system-defined newline. The same goes
for C's printf. printf writes a '\n'. The underlying library converts.
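
A small sketch of that conversion, assuming only standard C: write
"a\n" through a text stream, then reopen the same file as a binary
stream to see what actually landed on disk.  The function name
newline_on_disk() is made up for the illustration.

```c
#include <stdio.h>

/* Hypothetical helper: write "a\n" through a text stream, then reopen
   the file as a binary stream and copy the raw bytes into raw[].
   Returns the number of bytes found on disk, or -1 on failure. */
long newline_on_disk(const char *path, unsigned char *raw, size_t rawsz)
{
    long n;
    FILE *fp = fopen(path, "w");      /* text stream: library may translate '\n' */
    if (!fp)
        return -1;
    fprintf(fp, "a\n");               /* the program writes exactly one '\n' */
    if (fclose(fp) != 0)
        return -1;

    fp = fopen(path, "rb");           /* binary stream: no translation */
    if (!fp)
        return -1;
    n = (long)fread(raw, 1, rawsz, fp);
    fclose(fp);
    return n;
}
```

On a Unix-like library I would expect two bytes (61 0A); on a library
that uses CRLF line ends, three (61 0D 0A).  Either way the program
itself only ever wrote '\n'.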

  Stefan

glen herrmannsfeldt <gah@ugcs.caltech.edu> writes:

> Randy Yates <yates@ieee.org> wrote:
> (snip)
>
>>>> Someone on the TI forum determined the problem: I wasn't specifying
>>>> the "b" option in fopen. fopen(file, "rb")
> (snip)
>  
>> There is a 0x1a near there, yes. 
>  
>> I do feel a bit foolish. I've known about the "b" flag for decades -
>> just got careless. But also, TI's C file I/O has a history of
>> squirrelishness (sp?), so I was suspecting that type of nonsense.
>> Additionally, the exact same code ran fine when retargeted for
>> pc/cygwin. 
>
> It is pretty strange, overall.
>
> You should expect problems with X'0D' and/or X'0A', as those
> are the line terminator characters.
>
> Cygwin is supposed to work like unix, so better do I/O like unix.
>
> The X'1A' is left over from CP/M, where the directory counts
> blocks but not bytes.  Text files indicate the end of file with X'1A'.
>
> Somehow that tradition was kept with MSDOS, even though the file
> system does keep the file length in bytes.  There was no need for
> it, but they did it anyway.  That is a separate question from the
> use of control-Z for terminating terminal/console input.  Unix
> uses control-D for that, but doesn't terminate file input 
> when it finds a control-D.

This is some interesting history, glen. 

Now that I've sufficiently beaten myself up, I'm beginning to wonder if
I really made a mistake.

K&R (2e) doesn't really state in their file I/O section (appendix B) how
the non-"b" mode interacts with fread().  And fread() states, without
any special cases other than that it may not return the full amount of
data requested (the implication being that if there is no more data in
the file you won't get what you asked for), that it will read "nobj"s of
size "size".  In fact, I can remember thinking in the back of my mind
when I coded the fopen() that "b" mode only matters for character-based
functions like the scanf functions.
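
For reference, a sketch of the read loop in question with the "b"
added, assuming a fixed-size header followed by 16-bit samples as in
the original post.  BUFFER_LENGTH is the name from that post; the
function read_samples() and its header_bytes parameter are invented
here, and this is not the actual application code.

```c
#include <stdio.h>
#include <stdint.h>

#define BUFFER_LENGTH 256   /* samples requested per fread() call */

/* Hypothetical helper: skip header_bytes, then count the complete
   16-bit samples in the rest of the file.  Returns the sample count,
   or -1 on open, seek, or read error. */
long read_samples(const char *path, long header_bytes)
{
    int16_t buf[BUFFER_LENGTH];
    long total = 0;
    size_t got;
    int failed;
    FILE *fp = fopen(path, "rb");     /* "b": no newline or EOF-byte handling */

    if (!fp)
        return -1;
    if (fseek(fp, header_bytes, SEEK_SET) != 0) {
        fclose(fp);
        return -1;
    }
    /* fread() returns the number of complete objects read, which may
       be fewer than requested; 0 means end of file or an error. */
    while ((got = fread(buf, sizeof buf[0], BUFFER_LENGTH, fp)) > 0)
        total += (long)got;
    failed = ferror(fp);              /* tell an error apart from plain EOF */
    fclose(fp);
    return failed ? -1 : total;
}
```

A short count by itself says only "no more objects"; ferror() and
feof() say why.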
-- 
Randy Yates                      % "Midnight, on the water... 
Digital Signal Labs              %  I saw...  the ocean's daughter." 
mailto://yates@ieee.org          % 'Can't Get It Out Of My Head' 
http://www.digitalsignallabs.com % *El Dorado*, Electric Light Orchestra

Randy Yates <yates@ieee.org> wrote:
(snip)

>>> Someone on the TI forum determined the problem: I wasn't specifying
>>> the "b" option in fopen. fopen(file, "rb")
(snip)

> There is a 0x1a near there, yes. 

> I do feel a bit foolish. I've known about the "b" flag for decades -
> just got careless. But also, TI's C file I/O has a history of
> squirrelishness (sp?), so I was suspecting that type of nonsense.
> Additionally, the exact same code ran fine when retargeted for
> pc/cygwin. 

It is pretty strange, overall.

You should expect problems with X'0D' and/or X'0A', as those
are the line terminator characters.

Cygwin is supposed to work like unix, so better do I/O like unix.

The X'1A' is left over from CP/M, where the directory counts
blocks but not bytes.  Text files indicate the end of file with X'1A'.

Somehow that tradition was kept with MSDOS, even though the file
system does keep the file length in bytes.  There was no need for
it, but they did it anyway.  That is a separate question from the
use of control-Z for terminating terminal/console input.  Unix
uses control-D for that, but doesn't terminate file input 
when it finds a control-D.
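
To check whether a data file has an X'1A' near the point where reads
stop, something like the following sketch would do.  It opens the
file as a binary stream and reports the offset of the first matching
byte; the name find_first_byte() is made up for the example.

```c
#include <stdio.h>

/* Hypothetical helper: return the byte offset of the first occurrence
   of value in the file, -1 if it never occurs, -2 if the file cannot
   be opened. */
long find_first_byte(const char *path, int value)
{
    long offset = 0;
    int c;
    FILE *fp = fopen(path, "rb");     /* binary, so nothing is hidden from us */
    if (!fp)
        return -2;
    while ((c = getc(fp)) != EOF) {
        if (c == value) {
            fclose(fp);
            return offset;
        }
        offset++;
    }
    fclose(fp);
    return -1;
}
```

Calling find_first_byte(path, 0x1A) and comparing the result with the
number of bytes read before the premature end of file should show
whether the two line up.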

-- glen

Randy Yates wrote:
> Vladimir Vassilevsky <nospam@nowhere.com> writes:
>> [...]
>> However, what is good about chasing such a silly problem is that you
>> probably fixed a dozen suspicious places elsewhere, and now you can
>> cite the documentation by heart.
> 
> It will be a warm winter in Greenland before I forget the blasted "b" in
> an fopen() again!

That time may arrive all too soon! :-) This snow we've been having is a 
direct consequence of warm Pacific Ocean weather. BTW, Microsoft 
inherited that need from CP/M.

Jerry
-- 
Leopold Kronecker on mathematics:
        God created the integers, all else is the work of man.

Vladimir Vassilevsky <nospam@nowhere.com> writes:
> [...]
> However, what is good about chasing such a silly problem is that you
> probably fixed a dozen suspicious places elsewhere, and now you can
> cite the documentation by heart.

It will be a warm winter in Greenland before I forget the blasted "b" in
an fopen() again!
-- 
Randy Yates                      % "Ticket to the moon, flight leaves here today 
Digital Signal Labs              %  from Satellite 2"
mailto://yates@ieee.org          % 'Ticket To The Moon' 
http://www.digitalsignallabs.com % *Time*, Electric Light Orchestra


Randy Yates wrote:

> Tauno Voipio <tauno.voipio@notused.fi.invalid> writes:
> 
> 
>>On 26.2.10 7:14 , Randy Yates wrote:
>>
>>>On Feb 26, 11:39 am, Randy Yates<ya...@ieee.org>  wrote:
>>>
>>>>I'm using CCS 3.3.82.13 along with the cgtools 6.1.13. The target is
>>>>the CPU cycle-accurate 64xx simulator.
>>>>
>>>>I'm building an application to read some test vectors into the DSP
>>>>from the PC filesystem using the C file I/O system (fopen, fread,
>>>>fwrite, fclose, feof). The application is a simple C application (no
>>>>DSP/BIOS) running in the 512kB core SRAM.
>>>>
>>>>The problem is in the fread()/feof() functions. I'm first reading a
>>>>few bytes of header information with fread(), then I begin reading 16-
>>>>bit samples in chunks of BUFFER_LENGTH. The header fread()s work
>>>>perfectly. However, after 50 or so bytes of samples are read (the
>>>>exact number changes somewhat), fread() begins returning 0 bytes read
>>>>and feof() returns true, even though the actual file size is about 32k
>>>>bytes.
>>>>
>>>>Here are the things I've tried:
>>>>
>>>>   1. Varying the BUFFER_LENGTH from 1 to 16 to 128 to 256 to 1024 - no
>>>>change.
>>>>
>>>>    2. Increasing the heap (.sysmem) and stack (.stack) to 128k bytes
>>>>each. no change.
>>>>
>>>>   3. Creating a linker command file and explicitly placing all the
>>>>sections.
>>>>
>>>>   4. Aligning the .cio section to a 256-byte boundary.
>>>>
>>>>   5. Upgrading the toolchain (the previous version was 6.0.x).
>>>>
>>>>Please help!
>>>>
>>>>--Randy Yates
>>>
>>>Someone on the TI forum determined the problem: I wasn't specifying
>>>the "b" option in fopen. fopen(file, "rb")
>>>
>>>--Randy
>>
>>OUCH!
>>
>>This is a Microsoftism - did the file contain a MS-DOS EOF
>>(Control-Z, 0x1a) at the point of breakage?
> 
> 
> Hi Tauno,
> 
> There is a 0x1a near there, yes. 
> 
> I do feel a bit foolish. I've known about the "b" flag for decades -
> just got careless. But also, TI's C file I/O has a history of
> squirrelishness (sp?), so I was suspecting that type of nonsense.
> Additionally, the exact same code ran fine when retargeted for
> pc/cygwin. 

A peer review of the code would reveal that error at once. However, what 
is good about chasing such a silly problem is that you probably fixed a 
dozen suspicious places elsewhere, and now you can cite the 
documentation by heart.


Vladimir Vassilevsky
DSP and Mixed Signal Design Consultant
http://www.abvolt.com