Friday, February 24, 2012

Distribution Agent Startup Parameters

Hello,
I have a scenario with two servers, both running SQL Server 2000 SP4 on
Windows 2003 SP1. One server is the publisher, the other a subscriber. The
replication topology is transactional replication with one queued updating
subscriber. The Distribution Agent resides on the publisher.
Firstly, I am aware of nosync replication, but my environment requires some
more testing before I can implement it, because I can't take the publisher
offline, and I believe some tweaks are needed to ensure that the data will
truly be in sync. That said, one of my publications has a table article with
234 million records to transfer when the article is reinitialized. I am
having problems with the transaction log on the subscriber growing so large
that it consumes all available disk space and effectively kills the
Distribution Agent.
To mitigate this, I have tried the -BcpBatchSize parameter to get the rows
per batch down to a manageable size, so that I can manually shrink the
transaction log intermittently while the Distribution Agent is applying the
snapshot. My strategy is not working.
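For context, the Distribution Agent job step command I have been tweaking
looks roughly like this (server, database, and publication names are
placeholders, and the batch-size values are just examples of what I tried):

    distrib.exe -Publisher PUBSRV -PublisherDB PubDB -Publication BigPub
        -Distributor PUBSRV -Subscriber SUBSRV -SubscriberDB SubDB
        -BcpBatchSize 10000 -CommitBatchSize 100 -UseInprocLoader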
Questions:
1) Does -BcpBatchSize directly control the transaction commit size when the
snapshotted article is applied, or is this parameter used outside of the
Distribution Agent's application of a snapshot?
2) Does using -UseInprocLoader affect the -BcpBatchSize parameter? I ask
because BULK INSERT is different from bcp.
3) Would -CommitBatchSize be a more appropriate setting to allow me to
actively manage the transaction log on the subscriber while this huge table
is being transferred?
Thanks for any input,
John T
If you are using a concurrent snapshot on SQL 2000, your only hope will be
to make sure that you hit the minimal-logging code path when the snapshot is
delivered to the subscriber, as the *entire* snapshot will be applied in one
atomic transaction. At the risk of telling you something that you may know
already, this means that you need to:
1) Make sure that 'select into/bulkcopy' is enabled at the subscriber
database.
2) Ensure the target table is empty and has at most one index (or no
indexes) when the bulk copy occurs. This may be tricky to enforce, as the
SQL 2000 index creation logic in the distribution agent is not very precise.
You may want to modify the .idx scripts to leave only the primary key index
and then manually create the other indexes afterwards.
3) Set -BcpBatchSize to be *larger* than the maximum number of rows in your
published tables. 2^31 - delta should work fine. Note that you may start
seeing "Agent is waiting for database backend.." messages, since the
distribution agent will not be able to log any progress messages until the
entire bulk load operation is done. You may also want to increase
the -QueryTimeout of the distribution agent. A sketch of steps 1 and 3
follows.
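Here is that sketch, assuming a subscriber database named SubDB (a
placeholder) and example parameter values:

    -- Step 1: enable minimally logged bulk operations at the subscriber.
    -- On SQL 2000 this option corresponds to the BULK_LOGGED recovery model.
    EXEC sp_dboption 'SubDB', 'select into/bulkcopy', 'true'

    -- Step 3: on the Distribution Agent command line, make the bcp batch
    -- size larger than any published table and raise the query timeout:
    --     ... -BcpBatchSize 2147483647 -QueryTimeout 3600 ...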
-Raymond
"John T" <JohnT@.discussions.microsoft.com> wrote in message
news:469D5803-1C9E-447E-9513-2AE04EFEB305@.microsoft.com...
|||Hi Raymond, I thought -BcpBatchSize is analogous to the batch size in bcp,
which means that it controls how many rows will be applied at a time in a
batch. Won't setting it to a large value create a very large transaction log
for large tables?
Hilary Cotter
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html
Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com
"Raymond Mak [MSFT]" <rmak@.online.microsoft.com> wrote in message
news:evddMLk8GHA.4084@.TK2MSFTNGP05.phx.gbl...
|||Hilary, what you said is true, but the goal of my recommendations is not to
have small transactions but to hit the minimal-logging (or bulk-logged) code
path in the server, which is the only feasible way to have a manageable log
size when a concurrent snapshot is applied in SQL 2000. If a small batch size
is specified, the server will see the target table as non-empty after the
first batch is loaded and will go down the fully-logged code path after
that. And since there is an outer transaction bounding the delivery of a
concurrent snapshot in SQL 2000, the log simply cannot be truncated even with
a small bcp batch size. I can understand why you think this is
counter-intuitive, but my suggestion has worked on a few occasions before,
including the following:
http://groups.google.com/group/micro... e49f1fb79fe5
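As a rough illustration of the principle (the table and file names are made
up, and this is not literally what the agent executes):

    -- One big batch: the table stays empty for the whole load, so the
    -- insert can be minimally logged (given 'select into/bulkcopy' is on
    -- and the table has no indexes).
    BULK INSERT SubDB.dbo.BigTable FROM 'c:\snapshot\BigTable.bcp'
        WITH (DATAFILETYPE = 'native', TABLOCK)

    -- Many small batches: only the first batch sees an empty table, so
    -- every subsequent batch is fully logged, and the outer snapshot
    -- transaction prevents log truncation regardless.
    BULK INSERT SubDB.dbo.BigTable FROM 'c:\snapshot\BigTable.bcp'
        WITH (DATAFILETYPE = 'native', TABLOCK, BATCHSIZE = 10000)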
-Raymond
"Hilary Cotter" <hilary.cotter@.gmail.com> wrote in message
news:uuGAtgk8GHA.2316@.TK2MSFTNGP04.phx.gbl...
|||Ok, got it! Thanks Raymond. Still waiting on that beer you owe me;)
Hilary Cotter
Director of Text Mining and Database Strategy
RelevantNOISE.Com - Dedicated to mining blogs for business intelligence.
This posting is my own and doesn't necessarily represent RelevantNoise's
positions, strategies or opinions.
"Raymond Mak [MSFT]" <rmak@.online.microsoft.com> wrote in message
news:Oy1jKJl8GHA.2120@.TK2MSFTNGP03.phx.gbl...
|||Raymond,
Just a couple of points of clarification:
1) If I am using the simple recovery model, I should get the same benefits in
this case as using the bulk-logged recovery model, correct?
2) If I choose the schema option to drop the tables before the snapshot is
applied, this should meet all of the requirements in your second point,
correct?
3) Regarding the -BcpBatchSize setting of 2^31 - delta, what is the delta?
Is it my desired batch commit size?
Thanks
John T
"Raymond Mak [MSFT]" wrote:

|||Hi John,
In response to your inquiry:
1) Simple recovery mode is fine.
2) You may still need to make sure that only the primary key index gets
created, since the index creation logic in the SQL 2000 distribution agent is
not as optimal as it could be.
3) Delta is just a small number; you only need to make sure that the
resulting -BcpBatchSize is bigger than the maximum number of rows in any of
your tables. See the sketch below.
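Concretely, with 'SubDB' as a placeholder name, using delta = 1 gives
2^31 - 1 = 2147483647, the largest 32-bit signed integer:

    -- Verify the subscriber database uses SIMPLE (or BULK_LOGGED)
    -- recovery; either satisfies the minimal-logging requirement.
    SELECT DATABASEPROPERTYEX('SubDB', 'Recovery')

    -- Distribution Agent parameter:
    --     ... -BcpBatchSize 2147483647 ...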
Hope that helps,
-Raymond
"John T" <JohnT@.discussions.microsoft.com> wrote in message
news:A677C2FE-F672-4473-9B2E-725AFE500039@.microsoft.com...
|||Wait a minute, I think you are the one who owes me a beer, and I have proof
of that...
http://groups.google.com/group/micro... c299fb7328e2d
"Hilary Cotter" <hilary.cotter@.gmail.com> wrote in message
news:%23yI5FQr8GHA.3820@.TK2MSFTNGP02.phx.gbl...
|||freaking Google.
Hilary Cotter
"Raymond Mak [MSFT]" <rmak@.online.microsoft.com> wrote in message
news:enfPtCt8GHA.1492@.TK2MSFTNGP02.phx.gbl...
|||This suggestion worked excellently. I went from filling up over 275 GB of
drive space trying to deploy these large tables to using only 975 MB of
drive space.
Thank you,
John T
"Raymond Mak [MSFT]" wrote:
