I have around 75k records which I am loading to a Postgres table using copy command which is failing. I get an exception
ERROR: invalid byte sequence for encoding "UTF8": 0xbd
Now i need to find which line is having this entry. Is there any way to do this? I am thinking in lines of enabling some postgres logging that might help or any other solution Note: I am getting the issue with only one particular file. Other files are getting loaded without issues
I always seem to get a line-number in my error, no matter whether I use COPY
or \copy
and feed a file via redirection or -f
.
ERROR: invalid byte sequence for encoding "UTF8": 0xa3
CONTEXT: COPY z, line 3
If there are only a couple of bad chars and you just want to strip them you can use iconv (assuming you're on a unix-like system).
iconv -c --from=utf8 --to=utf8 /tmp/badchars.txt > /tmp/stripped.txt
You could always run diff against the before + after versions if you wanted to see what was stripped out.