|They see me rollin'|
Unfortunately I wasn't able to. I looked at pg_filedump and xlogdump, but I didn't really get out of them what I wanted. Ideally I would have liked to say something like 'CREATE TABLE bar AS SELECT * FROM foo WHERE corrupt_col IS NOT NULL USING SNAPSHOT ANY'. For those of you that are unfamiliar, there are a handful of global snapshots internal to Postgres that allow you to peek under the covers of the MVCC. 'SnapshotAny' is one that allows you to see "any" row in the relation, including deleted ones that haven't been vacuumed yet. As far as I could tell, there was no better solution then restoring from a backup.
So after being unable to meet the client's expectations, and some prodding by co-workers, I decided to see what I could come up with that would meet my needs. So today I made publicly available the first version of pg_dirtyread. It's an extension that provides a simple function for reading heap tuples from relations using SnapshotAny. It's not ideal as, among other drawbacks, it requires a full sequential scan of the relation, but at least it's a start.
SELECT * FROM pg_dirtyread('foo'::regclass) as t(bar bigint, baz text);Where the schema of foo is (bar bigint, baz text)
I'd be interested in any feedback on it as there is plenty of room for improvement. Maybe even some similar projects out there that my google-fu did not find for me. Also, I know the name is fallacious. A dirty read is reading rows that aren't committed. There is a different snapshot for that even. Maybe I should have called it pg_readany, but that name didn't sound as cool. Maybe in the future I will call it something like pg_readsnapshot and allow you to specify which snapshot you would like to use.