Quickie: Strip ^M (Control-M) Characters from an Input File with Perl
To anyone who does file I/O and sometimes has to work with Windows-generated files in Linux: I feel your pain. Windows has little quirks that sometimes make our GNU/Linux world a fun place to live. Luckily, Perl has a simple tool for removing control characters: regular expressions. Those not familiar with them will find great references at http://www.regular-expressions.info/ (a fantastic place to begin) and http://www.regextester.com/ (where you can test your brilliant work).
People who just want a quick piece of code, look no further:
If you want to do it all in one line from the CLI (of course replace *.txt with whatever extension):
perl -pi -w -e 's/\x0D//g;' *.txt
If you'd rather do it inline in a Perl script:
# Good code
$yourLine =~ s/\x0D//g; # strips ^M characters
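As a minimal sketch of how that substitution might sit inside a script, the snippet below reads lines from an in-memory string that stands in for a Windows-generated file. The sample data and the `@clean` array are made up for illustration; in real use you would open your own input file instead.

```perl
use strict;
use warnings;

# Hypothetical sample data standing in for a Windows-generated file (CRLF line endings).
my $data = "alpha\r\nbeta\r\ngamma\r\n";

# Open the string as a filehandle so the example needs no real file on disk.
open(my $fh, '<', \$data) or die "Cannot open in-memory file: $!";

my @clean;
while (my $yourLine = <$fh>) {
    $yourLine =~ s/\x0D//g;   # strip every carriage return (^M)
    chomp $yourLine;          # then drop the trailing newline
    push @clean, $yourLine;
}
close $fh;

print join('|', @clean), "\n";   # prints alpha|beta|gamma
```

Stripping the carriage return before calling chomp means each stored line holds only the data itself, with no stray control character left at the end.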
Simply trying to strip ^M characters with
# Bad code
$yourLine =~ s/\^M//g; # does NOT strip ^M characters
unfortunately does not work: \^M matches a literal caret followed by the letter M, whereas the ^M your editor shows is a single carriage-return character (hex 0D). The hex value above works great for me. I've run into this problem many times while taking third-party data feeds, which are sometimes generated in Windows, and trying to preprocess them in my GNU/Linux environment. The ^M gets interpreted as a new line and wreaks havoc on feeds where you expect all the information to be on one line in a fixed number of columns.
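To see the difference between the two patterns concretely, here is a small sketch that runs both against the same string. The sample line and the variable names `$bad`, `$good`, and `$alt` are invented for the demonstration; `\cM` is included only as another way Perl lets you write the same control character.

```perl
use strict;
use warnings;

# Hypothetical feed line ending in a real carriage return (one character, hex 0D).
my $line = "a,b,c\x0D";

# Bad pattern: \^M looks for the two literal characters '^' and 'M'.
(my $bad = $line) =~ s/\^M//g;

# Good pattern: \x0D matches the carriage-return character itself.
(my $good = $line) =~ s/\x0D//g;

# \cM (Control-M) is another spelling of the same character in a Perl regex.
(my $alt = $line) =~ s/\cM//g;

printf "original: %d chars\n", length $line;   # 6
printf "bad:      %d chars\n", length $bad;    # 6, nothing was removed
printf "good:     %d chars\n", length $good;   # 5
printf "alt:      %d chars\n", length $alt;    # 5
```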
For more information on control characters, please go to http://www.cs.tut.fi/~jkorpela/chars.html, where Jukka Korpela explains in depth what control characters are, what issues you may have, and how you may go about resolving them.