paperlined.org
document updated 24 days ago, on Nov 10, 2022

auto-detecting a file's type

see also — en.wikipedia.org/wiki/Content_sniffing

tools available

I really like using file(1), together with its magic(5) database, to auto-detect a file's type from its contents. The database covers a very large number of file types, and it usually gets things right. However, the result is still a guess, and guesses are sometimes wrong.
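To make the idea concrete, here is a minimal Python sketch of signature-based detection: it compares the first bytes of a file against a handful of well-known magic numbers. The names SIGNATURES, sniff_type, and sniff_file are hypothetical, and the table is tiny. The real magic(5) database is far larger and supports offsets and nested tests that this sketch ignores.

```python
# Hypothetical, tiny signature table; the real magic(5) database is far larger.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "PNG image"),
    (b"%PDF-", "PDF document"),
    (b"\x1f\x8b", "gzip compressed data"),
    (b"PK\x03\x04", "ZIP archive"),
    (b"\x7fELF", "ELF binary"),
]


def sniff_type(data: bytes) -> str:
    """Guess a type from the leading bytes; 'unknown' when nothing matches."""
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    return "unknown"


def sniff_file(path: str) -> str:
    """Read only the first few bytes of the file and guess its type."""
    with open(path, "rb") as f:
        return sniff_type(f.read(16))
```

Even this small table shows why the answer is only a guess: a .docx or .jar file is a ZIP container, so it is reported as "ZIP archive" unless further tests look inside it.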

text vs binary

Be aware that deciding whether a file is text is itself a heuristic, and different heuristics often disagree about whether a particular file is text.
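As an illustration of how two heuristics can disagree, here is a Python sketch of two common styles of check: one that only looks for NUL bytes, and one that also counts "odd" bytes (unusual control codes and bytes with the high bit set). The function names, window sizes, and the 30% threshold are assumptions chosen for the example, not the exact rules of any particular tool.

```python
# Control bytes that commonly appear in plain text: BEL BS TAB LF FF CR ESC.
_TEXT_CONTROL = {7, 8, 9, 10, 12, 13, 27}


def looks_text_nul(data: bytes, window: int = 8000) -> bool:
    """Heuristic 1: text if there is no NUL byte in the first `window` bytes."""
    return b"\x00" not in data[:window]


def looks_text_ratio(data: bytes, window: int = 512, max_odd: float = 0.30) -> bool:
    """Heuristic 2: text if the first block has no NUL and few 'odd' bytes."""
    block = data[:window]
    if not block:
        return True  # assumption: an empty file counts as text
    if 0 in block:
        return False
    odd = sum(
        1
        for b in block
        if b >= 0x80 or b == 0x7F or (b < 0x20 and b not in _TEXT_CONTROL)
    )
    return odd / len(block) <= max_odd


# Russian text in the single-byte KOI8-R encoding: almost every byte has the high bit set.
koi8 = "привет мир".encode("koi8_r")
print(looks_text_nul(koi8), looks_text_ratio(koi8))  # True False
```

The same bytes are "text" to the first heuristic and "binary" to the second, which is the kind of disagreement described above.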

Some tools that contain text vs binary heuristics:

auto-detecting character encoding within text files

see also — en.wikipedia.org/wiki/Charset_detection
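For a sense of what the simplest form of charset detection looks like, here is a Python sketch that checks for a byte order mark and then tries a strict UTF-8 decode. The function name guess_encoding and the "unknown-8bit" label are hypothetical. Dedicated detectors go much further, typically using statistical models to tell legacy single-byte encodings apart, which this sketch does not attempt.

```python
import codecs

# UTF-32 BOMs are checked first: the UTF-32-LE BOM begins with the UTF-16-LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def guess_encoding(data: bytes) -> str:
    """Guess an encoding from a BOM, else from whether the bytes are valid UTF-8."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    try:
        data.decode("utf-8")  # strict decoding: raises on any invalid sequence
    except UnicodeDecodeError:
        return "unknown-8bit"  # hypothetical label: some legacy encoding
    return "ascii" if data.isascii() else "utf-8"
```

A successful UTF-8 decode is strong evidence but not proof, since short byte strings in other encodings can happen to be valid UTF-8.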

Some tools that can do this:

Perl source files

Some tools that can auto-detect if a file contains Perl source:
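A rough heuristic can also be sketched by hand in Python: treat a file as Perl if its shebang line mentions perl, or if an early line looks like a typical Perl statement. The function name looks_like_perl, the 4096-byte window, and the specific patterns are assumptions for illustration. Like every check on this page, it is a guess and will miss some files and misjudge others.

```python
import re

# Shebang such as "#!/usr/bin/perl -w" or "#!/usr/bin/env perl".
_SHEBANG_PERL = re.compile(rb"^#!.*\bperl")

# Lines that are common near the top of Perl files (an assumption, not a grammar).
_PERL_HINTS = re.compile(
    rb"^\s*(use\s+(strict|warnings)\b|package\s+[\w:]+\s*;)",
    re.MULTILINE,
)


def looks_like_perl(data: bytes) -> bool:
    """Guess whether the bytes are Perl source, using the shebang and a few hints."""
    first_line = data.split(b"\n", 1)[0]
    if _SHEBANG_PERL.match(first_line):
        return True
    return bool(_PERL_HINTS.search(data[:4096]))
```

Checking the shebang first handles scripts, while the hint patterns catch modules (.pm files), which usually have no shebang line.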