Hi,

Every so often an invalid ev_snapshot value appears in the sl_event table. A txid is listed twice (like 379383437 below), and PostgreSQL rejects the value as invalid input for the txid_snapshot type:

select * from "pg_catalog".txid_snapshot_xip('379383396:379383451:379383396,379383437,379383437,379383442') ) order by log_actionseq"
PGRES_FATAL_ERROR ERROR: invalid input for txid_snapshot: "379383396:379383451:379383396,379383437,379383437,379383442"
LINE 1: ...d "pg_catalog".txid_visible_in_snapshot(log_txid, '379383396...

The only way to get rid of the replication lag is to update the sl_event table. For example, replacing "379383396:379383451:379383396,379383437,379383437,379383442" with "379383396:379383451:379383396,379383437,379383442" solves the problem.

Best regards
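The manual fix described above amounts to dropping repeated entries from the xip list of the snapshot text (format xmin:xmax:xip1,xip2,...). A minimal Python sketch of that repair follows, assuming the value has already been fetched as text. The function name dedupe_snapshot_xips is hypothetical; it is not part of Slony or PostgreSQL.

```python
def dedupe_snapshot_xips(snapshot: str) -> str:
    """Return the snapshot text with repeated xip entries removed.

    Expects the textual form 'xmin:xmax:xip1,xip2,...'.
    The xip list may be empty. Order of first occurrence is kept.
    """
    xmin, xmax, xip_text = snapshot.split(":")
    seen = set()
    unique = []
    for xip in xip_text.split(","):
        # Skip the empty string produced by an empty xip list,
        # and any xid that has already been emitted.
        if xip and xip not in seen:
            seen.add(xip)
            unique.append(xip)
    return f"{xmin}:{xmax}:{','.join(unique)}"


if __name__ == "__main__":
    broken = "379383396:379383451:379383396,379383437,379383437,379383442"
    print(dedupe_snapshot_xips(broken))
```

The result for the value from this report is the same corrected string given above. Writing it back into sl_event is still a manual step and should be done with care on a live cluster.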
Created attachment 162: Slony log file
What are the exact PostgreSQL and Slony versions?
PostgreSQL 9.2.4
Slony 2.1.3
I am still hunting this one.

Running a multi-client pgbench for hours and using txid snapshot in-out functions all the time alongside, I was not yet able to create a single occurrence of this problem. Does your application by any chance use subtransactions, like "exceptions" in PL/pgSQL?
(In reply to comment #4)
> I am still hunting this one.
>
> Running a multi-client pgbench for hours and using txid snapshot in-out
> functions all the time alongside, I was not yet able to create a single
> occurrence of this problem. Does your application by any chance use
> subtransactions, like "exceptions" in PL/pgSQL?

Yes, we use "exceptions" in PL/pgSQL. I'll try to check whether this could be the source of the problem.

We also use prepared transactions managed by the application server. Could two-phase commit affect Slony's operation?
I don't think that this bug has anything to do with Slony in particular. The txid_snapshot used by Slony is simply the result of txid_current_snapshot(), rendered as text by the txid_snapshot data type's output function. This seems to be a bug in PostgreSQL itself, which I still haven't been able to reproduce.
A discussion on pgsql-hackers has revealed that this is a bug in PostgreSQL connected to two-phase commit. There is apparently a small window in which two PGPROC entries are visible with the same xid. I am proposing a patch to PostgreSQL that will remove duplicate xip entries in txid_current_snapshot() and ignore existing duplicates in txid_snapshot_in().
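To illustrate the two input behaviours the proposed patch distinguishes, here is a rough Python model of a snapshot parser: in strict mode a repeated xip is rejected (the error reported in this bug), in tolerant mode it is skipped. This is a sketch of the idea only, not PostgreSQL's C code. The name parse_snapshot, its tolerate_duplicates flag, and the exact range and ordering checks are assumptions made for illustration.

```python
def parse_snapshot(text: str, tolerate_duplicates: bool = True):
    """Parse 'xmin:xmax:xip1,xip2,...' into (xmin, xmax, [xips]).

    Hypothetical model: the xip list is assumed to be ascending and
    within [xmin, xmax). A value equal to its predecessor is a duplicate.
    """
    error = ValueError(f"invalid input for txid_snapshot: {text!r}")
    xmin_s, xmax_s, xip_s = text.split(":")
    xmin, xmax = int(xmin_s), int(xmax_s)
    if xmin > xmax:
        raise error

    xips = []
    for item in filter(None, xip_s.split(",")):
        xip = int(item)
        if xips and xip == xips[-1]:
            if tolerate_duplicates:
                continue  # tolerant mode: silently drop the repeat
            raise error   # strict mode: duplicates are a format error
        if not (xmin <= xip < xmax) or (xips and xip < xips[-1]):
            raise error   # out of range or out of order in either mode
        xips.append(xip)
    return xmin, xmax, xips


if __name__ == "__main__":
    value = "379383396:379383451:379383396,379383437,379383437,379383442"
    print(parse_snapshot(value))  # tolerant: duplicate is skipped
    try:
        parse_snapshot(value, tolerate_duplicates=False)
    except ValueError as exc:
        print(exc)                # strict: rejected, as in this report
```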
Has this patch been provided yet? It would help us get rid of the manual process of removing the duplicate entries in sl_event.

Thanks

(In reply to comment #7)
> A discussion on pgsql-hackers has revealed that this is a bug in PostgreSQL
> connected to two-phase commit. There is apparently a small window in which two
> PGPROC entries are visible with the same xid.
>
> I am proposing a patch to PostgreSQL that will remove duplicate xip entries in
> txid_current_snapshot() and ignore existing duplicates in txid_snapshot_in().
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=8f9b9590d79fc1fc1ad08b207401acfdbb0bfac7
on head

If you search around you should be able to find the commit on REL9.2, but I would just recommend upgrading to the latest 9.2 minor release (I hear a new PG release is coming next week).
We recently had this again, and we are running PostgreSQL 9.3.4 with Slony 2.2.2.

(In reply to comment #9)
>
> http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=8f9b9590d79fc1fc1ad08b207401acfdbb0bfac7
> on head
>
> If you search around you should be able to find the commit on REL9.2 but I
> would just recommend upgrading to the latest 9.2 minor release (I hear a new PG
> release is coming next week)
Looking at the git logs,

This patch is not included in 9.3.4. I would expect it to be included in 9.3.5.
PostgreSQL 9.3.5 was released today. I went through the release notes and they do not mention this fix. Could you confirm from the git logs whether it actually went into this release?

Thanks for your help.

(In reply to comment #11)
> Looking at the git logs,
>
> This patch is not included in 9.3.4. I would expect it to be included in 9.3.5.
It is in 9.3.5 according to the git logs:
http://git.postgresql.org/gitweb/?p=postgresql.git;a=log;h=refs/tags/REL9_3_5
Search for "Handle duplicate XIDs in txid_snapshot."

I am marking this bug as resolved since the fix is now included in a released version of PG.
*** Bug 340 has been marked as a duplicate of this bug. ***