gdocter Posted July 24, 2003 Is there a way to determine whether a file was written in binary format? All I can find is text encoding, not non-text encoding.
*Gurus* Derek Stone Posted July 25, 2003 A file is always binary. If the file was ANSI/Unicode/UTF encoded, it's still binary; however, it is interpreted as text.
gdocter (Author) Posted July 28, 2003 re: binary Thanks, I knew that ;) but let me rephrase: how do you determine, then, that no encoding was used whatsoever?
*Experts* Volte Posted July 28, 2003 Generally, files of a certain type contain a header. For example, a .GIF file starts with a header that says "GIF87a" or "GIF89a" so that programs can determine that it is a GIF. All files can be interpreted as text or interpreted as binary.
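A minimal sketch of the header check described above, in Python (the thread itself doesn't name a language, so that choice and the function names are assumptions for illustration). It only compares the first six bytes against the two GIF signatures mentioned in the post; it doesn't validate the rest of the file.

```python
GIF_SIGNATURES = (b"GIF87a", b"GIF89a")  # the two headers named in the post


def looks_like_gif(data: bytes) -> bool:
    """Return True if the data starts with one of the GIF signatures."""
    return data[:6] in GIF_SIGNATURES


def file_looks_like_gif(path: str) -> bool:
    """Read only the first six bytes of a file and check the signature."""
    with open(path, "rb") as f:  # "rb": read raw bytes, no text decoding
        return looks_like_gif(f.read(6))
```

A matching header only suggests the file is a GIF. Any file could happen to begin with those six bytes.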
*Gurus* Derek Stone Posted July 28, 2003 You can't for sure, that's the whole point. Each of the encoding types has its own byte-level signature, and some offer byte order marks; however, there's nothing mandating that the signature be used only in text files. This is part of the reason why file extensions exist: to indicate what type of file is being dealt with. For example: ASCII uses 7 bits to represent one character. UTF-8 uses one to four octets per character (the original specification allowed up to six), with the initial octet serving as both an indicator of the number of subsequently used octets and a portion of the character value. UTF-8 files may also be marked with an optional opening byte sequence of EF BB BF. And while UTF-8 is relatively easy to spot, there's absolutely nothing to distinguish ASCII by other than checking whether or not there are bytes in the file that fall outside its range. Any byte above 7F would show that the file is not ASCII-encoded, and a null value, 00, while technically an ASCII control character, almost never appears in real text, so it is a strong indication as well.
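The checks described in this post can be sketched roughly as follows, again in Python. The function name `sniff` and the labels it returns are made up for illustration, and the result is only a guess: as the post says, nothing stops a non-text file from carrying the same bytes.

```python
UTF8_BOM = b"\xef\xbb\xbf"  # the EF BB BF opening sequence mentioned above


def sniff(data: bytes) -> str:
    """Guess how a byte string might be interpreted. Heuristic only."""
    if data.startswith(UTF8_BOM):
        return "utf-8 with BOM"
    if b"\x00" in data:
        # Null bytes almost never occur in ASCII or UTF-8 text.
        return "probably binary"
    if all(b < 0x80 for b in data):
        # Every byte fits in 7 bits, so it is consistent with ASCII.
        return "ascii-compatible"
    try:
        data.decode("utf-8")  # strict decoding rejects malformed sequences
        return "valid utf-8"
    except UnicodeDecodeError:
        return "unknown"
```

One known limitation of this sketch: UTF-16 text is full of 00 bytes, so it would be reported as "probably binary" here. Handling it would mean checking for its byte order marks as well.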