Apotelesm: This can't be right

Thursday, January 15, 2009

This can't be right

Lines in the input file may contain a hex sequence interspersed among ASCII text, in the form 0x{hex characters}. The script below replaces each such sequence with the bytes it spells out, treated as UTF-8.


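import fileinput
import re
import codecs

utf = codecs.lookup('utf-8')
# One byte, written as two lowercase hex digits.
hex_pair = re.compile("[0-9a-f]{2}")
# A whole hex run: the literal prefix 0x followed by lowercase hex digits.
pattern = re.compile("0x[0-9a-f]*")

def hexrepl(matchobj):
  # Convert one matched 0x... run into the raw bytes it spells out.
  # findall skips the "0x" prefix on its own, since "0x" is not a hex pair;
  # a trailing odd digit is silently dropped for the same reason.
  text = matchobj.group(0)
  numbers = []
  for m in hex_pair.findall(text):
    numbers.append(chr(int(m,16)))
  result = "".join(numbers)
  # Round-trip through UTF-8; this raises if the bytes are not valid UTF-8.
  return utf.encode(utf.decode(result)[0])[0]

# Python 2: read lines from the files named on the command line, or stdin.
for line in fileinput.input():
  line = re.sub(pattern,hexrepl,line)
  print line.rstrip()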

This feels overly complicated, and I'm only an occasional Python hack. I'm forgetting something.
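
One possibly simpler route, sketched here under a few assumptions: it targets Python 3 rather than the Python 2 of the script above, it relies on the built-in bytes.fromhex to do the pair-by-pair conversion, and it only matches an even number of hex digits so that a stray odd digit is left in the text rather than dropped. The names HEX_RUN and decode_hex_runs are made up for illustration, and the choice of errors="replace" for invalid UTF-8 is a guess at what would be wanted, not something the original script does (it raises instead).

```python
import re

# "0x" followed by one or more pairs of lowercase hex digits.
HEX_RUN = re.compile(r"0x((?:[0-9a-f]{2})+)")


def decode_hex_runs(line):
    """Replace each 0x... run in line with the UTF-8 text its bytes encode."""

    def repl(match):
        # bytes.fromhex turns "c3a9" into b"\xc3\xa9" in one call.
        raw = bytes.fromhex(match.group(1))
        # Assumption: substitute U+FFFD for invalid bytes instead of raising.
        return raw.decode("utf-8", errors="replace")

    return HEX_RUN.sub(repl, line)
```

To process files the same way as the script above, the function could be called on each line from fileinput.input() and the result printed with print(decode_hex_runs(line).rstrip()).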
