Fri Aug 12 22:24:44 PDT 2005
- Previous message: [Slony1-general] query too complex after subscribing a set with big tables at busy time
- Next message: [Slony1-general] Replicating an existing large database
Another not so obvious fix for the problem would be as follows. First let me try to explain what that actionseq list is coming from. When a new set is subscribed AND the provider is the origin, there is no way to get a snapshot of all the tables at exactly a SYNC point. Meaning that the copy operation will see some arbitrary state somewhere in the middle between two SYNC events. So the first ever done SYNC processing after the COPY_SET must filter out those actions that are already incorporated into the data since the last SYNC. Now, I "think" (didn't check the code actually) that the code generating that list of "actionseq's after last SYNC" grabs way too many of those rows. If it would apply the exact filter logic of "what log rows have been generated since the last SYNC I can see", then that actionseq list cannot contain more numbers than row operations happen between two sync events. Another and way more elegant possibility would be that the copy_set logic actually stores its very own serializable snapshot information as the current info in sl_setsync. That would eliminate the whole actionseq list altogether. Jan On 8/11/2005 12:43 PM, Hannu Krosing wrote: > On K, 2005-08-10 at 17:39 -0400, Christopher Browne wrote: >> Hannu Krosing <hannu at skype.net> writes: >> > On T, 2005-08-09 at 17:07 -0400, Christopher Browne wrote: >> > >> >> I've got a FSM whiteboarded that will efficiently parse the list of >> >> integers (when you can use an FSM, your problem has "blazingly fast" >> >> written all over it!); I'll have to do a bit of thinking about what to >> >> do about processing those integers :-). >> >> >> >> I now know the FSM part is easy; I'll have to sleep on the rest... >> > >> > my python code was just for testing the processing algorithm (I got it >> > work right in 4 tries :) >> > >> > ... >> >> I have a patch that passes the "test_1_pgbench" test, which definitely >> exercises it. (Albeit not with Just Insanely Big Queries...) >> >> If the set of "actionseq" values are being returned in random order, >> it won't turn out terribly well, but if they are in even a fairly >> approximate ordering, it'll be a BIG help. And it looks as though >> they are returned more or less in order. > > I think that expanding "NOT IN (...)" into a list of 'AND > log_actionseq != NNNNN' is what postgres query does anyway, so any of > these won't make things worse and any 'AND log_actionseq BETWEEN MMM AND > LLL' mixed in will reduce plan size and complexity. > >> I'd like for someone else to "use this patch in anger" before I >> consider adding it into CVS HEAD. > > I am quite busy right now, but I'll see if I can find some time tomorrow > to check it out. > > Thanks :) > -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #================================================== JanWieck at Yahoo.com #