On Tue, 2006-02-14 at 11:06 -0600, David Gibbs wrote:
> Folks:
> 
> Does anyone know of a technique to determine if a file (in the IFS) is
> text or binary (in java)?

One definition of a 'text file' might be that no byte in the file has a
value greater than x'7F', or 127 (if the data happens to be ASCII).  You
then ought to consider whether there are any low values that you wish to
prohibit, such as the NUL byte (x'00').

I suppose what I might do would be to count all of the high-bit
characters and NULLs and compare that to the total number of bytes.
Perhaps you can find a threshold wherein if the percentage of high-bit
characters gets to be too high, you consider the file to be a 'binary'
file.
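
As a rough sketch of that counting idea (the class name ByteStats, the
method names, and the threshold parameter are all made up for
illustration, and the right threshold is something you would have to
tune against your own files):

```java
// Hypothetical helper: counts NUL and high-bit bytes in a buffer.
final class ByteStats {

    // Fraction of the first 'length' bytes that are NUL (0x00)
    // or have the high bit set (greater than 0x7F).
    static double suspiciousRatio(byte[] data, int length) {
        if (length <= 0) {
            return 0.0; // an empty sample has nothing suspicious in it
        }
        int suspicious = 0;
        for (int i = 0; i < length; i++) {
            int b = data[i] & 0xFF; // Java bytes are signed; widen to 0..255
            if (b == 0 || b > 0x7F) {
                suspicious++;
            }
        }
        return (double) suspicious / length;
    }

    // Calls the data 'binary' when the ratio exceeds the caller's threshold.
    // The threshold is an assumption to be tuned, not a known-good value.
    static boolean looksBinary(byte[] data, int length, double threshold) {
        return suspiciousRatio(data, length) > threshold;
    }
}
```

This assumes the data is ASCII, as above.  If the file is in some other
encoding, the high-bit test will flag ordinary text, so the set of
'suspicious' values would need to change.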

I seem to recall that some of the 'what sort of file is it' routines
only look at the first 1K or 10K of the data, and make the judgment from
there.
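
Reading only a leading sample might look something like the following.
This is a minimal sketch: FileSampler and readPrefix are hypothetical
names, and it assumes the file can be opened through the ordinary
java.nio.file API from wherever the JVM is running.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper: reads at most maxBytes from the start of a file.
final class FileSampler {

    static byte[] readPrefix(Path file, int maxBytes) throws IOException {
        byte[] buffer = new byte[maxBytes];
        int total = 0;
        try (InputStream in = Files.newInputStream(file)) {
            // read() may return fewer bytes than requested, so loop until
            // the buffer is full or the end of the file is reached.
            while (total < maxBytes) {
                int n = in.read(buffer, total, maxBytes - total);
                if (n < 0) {
                    break; // end of file
                }
                total += n;
            }
        }
        // Trim the buffer when the file is shorter than maxBytes.
        return Arrays.copyOf(buffer, total);
    }
}
```

The returned array can then be handed to whatever byte-counting test you
settle on, for example with maxBytes set to 1024.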

I would think you could roll an is_binary_file class pretty quickly.

Regards,
Rich

