Showing posts with label cleanup. Show all posts
Showing posts with label cleanup. Show all posts

Wednesday, March 7, 2012

Distribution table cleanup

Using SQL Server 2005. Replication working fine except the distribution table is continuely growing. Started to configure maintenance job (catagory: REPL-Distribution Cleanup) Any suggested steps that will not break the replication. Is there a SP available that will address my problem.

Thanks,

You can reduce the retention period using sp_changedistributiondb.

For more info, please refer to the following articles.

http://msdn2.microsoft.com/fr-fr/library/ms146868.aspx

http://msdn2.microsoft.com/fr-fr/library/ms181877.aspx

Regards,

Gary

Distribution table cleanup

Using SQL Server 2005. Replication working fine except the distribution table is continuely growing. Started to configure maintenance job (catagory: REPL-Distribution Cleanup) Any suggested steps that will not break the replication. Is there a SP available that will address my problem.

Thanks,

You can reduce the retention period using sp_changedistributiondb.

For more info, please refer to the following articles.

http://msdn2.microsoft.com/fr-fr/library/ms146868.aspx

http://msdn2.microsoft.com/fr-fr/library/ms181877.aspx

Regards,

Gary

Saturday, February 25, 2012

distribution database having problems??

Something is wrong with my distribtion database. When the distribution clean
up agent runs (or I run dbo.sp_MSdistribution_cleanup manually)
I get the following error message:
Executed as user: sa. Table error: Database ID 9, object ID 5575058, index
ID 1. Chain linkage mismatch. (1:89299)->next = (1:59339), but
(1:59339)->prev = (0:0). [SQLSTATE HY000] (Error 8908). The step failed.
SO then I run
DBCC CHECKDB
on the distribution database. The MSrepl_commands is the table with the
problem with repeated messages such as:
Table error: Object ID 5575058, index ID 1. B-tree page (1:91019) has two
parent nodes (1:93624), slot 103 and (1:89299), slot 1.
I run this:
dbcc checktable ('MSrepl_commands')
and get the same type of error message.
I have no idea how to fix this. Any advise?? (If I try and use the repair
options of the dbcc commands than it says I have to have the DB in single
user mode. Not sure if you are even suppose to do that with a distribution
db?)
Thanks so much for any and all help,
Kristy
Believe it or not these problems crop up from time to time on replicated
databases. Stop the sql server agent on the publisher, yank the network
cable (if possible), and put this database in single user mode and do the
repair.
Hilary Cotter
Director of Text Mining and Database Strategy
RelevantNOISE.Com - Dedicated to mining blogs for business intelligence.
This posting is my own and doesn't necessarily represent RelevantNoise's
positions, strategies or opinions.
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html
Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com
"Kristy" <pleasereplyby@.posting.com> wrote in message
news:OCFMw7kPGHA.1696@.TK2MSFTNGP14.phx.gbl...
> Something is wrong with my distribtion database. When the distribution
> clean
> up agent runs (or I run dbo.sp_MSdistribution_cleanup manually)
> I get the following error message:
> Executed as user: sa. Table error: Database ID 9, object ID 5575058, index
> ID 1. Chain linkage mismatch. (1:89299)->next = (1:59339), but
> (1:59339)->prev = (0:0). [SQLSTATE HY000] (Error 8908). The step failed.
> SO then I run
> DBCC CHECKDB
> on the distribution database. The MSrepl_commands is the table with the
> problem with repeated messages such as:
> Table error: Object ID 5575058, index ID 1. B-tree page (1:91019) has two
> parent nodes (1:93624), slot 103 and (1:89299), slot 1.
> I run this:
> dbcc checktable ('MSrepl_commands')
> and get the same type of error message.
> I have no idea how to fix this. Any advise?? (If I try and use the
> repair
> options of the dbcc commands than it says I have to have the DB in single
> user mode. Not sure if you are even suppose to do that with a distribution
> db?)
> Thanks so much for any and all help,
> Kristy
>
>
|||Thanks Hilary. I will check to see if we can yank the network cable. Not
sure if we can though because it is on a remote server hosted by another
company in another state.
If not, is there something else I can do? (Besides redo replication and drop
the distribution DB.)
As always, thanks!
--Kristy
"Hilary Cotter" <hilary.cotter@.gmail.com> wrote in message
news:eODseSmPGHA.5516@.TK2MSFTNGP09.phx.gbl...[vbcol=seagreen]
> Believe it or not these problems crop up from time to time on replicated
> databases. Stop the sql server agent on the publisher, yank the network
> cable (if possible), and put this database in single user mode and do the
> repair.
> --
> Hilary Cotter
> Director of Text Mining and Database Strategy
> RelevantNOISE.Com - Dedicated to mining blogs for business intelligence.
> This posting is my own and doesn't necessarily represent RelevantNoise's
> positions, strategies or opinions.
> Looking for a SQL Server replication book?
> http://www.nwsu.com/0974973602.html
> Looking for a FAQ on Indexing Services/SQL FTS
> http://www.indexserverfaq.com
>
> "Kristy" <pleasereplyby@.posting.com> wrote in message
> news:OCFMw7kPGHA.1696@.TK2MSFTNGP14.phx.gbl...
index[vbcol=seagreen]
failed.[vbcol=seagreen]
two[vbcol=seagreen]
single[vbcol=seagreen]
distribution
>

Distribution Cleanup Job Fails

The distribution cleanup job that runs for our transactional replication on
sql server 2005 fails.
The job is:
EXEC dbo.sp_MSdistribution_cleanup @.min_distretention = 0,
@.max_distretention = 72
The Error is:
Msg 20015, Level 16, State 1, Procedure sp_MSreplremoveuncdir, Line 83
Could not remove directory
'\\sprs76\rysrepldata\unc\SPRS76_RYS_RYS_OPERATION ALTOREPLICAHIST\20060713140760\'.
Check the security context of xp_cmdshell and close other processes that may
be accessing the directory.
Replication-@.rowcount_only parameter must be the value 0,1, or 2. 0=7.0
compatible checksum. 1=only check rowcou: agent
RYSSprs76_distribution@.rowcount_only parameter must be the value 0,1, or 2.
0=7.0 compatible checksu scheduled for retry. Could not clean up the
distribution transaction tables.
If I remove this directory manually, would it cause the replication to
fail? I have tried this on a Test server and it seems that the replication
fails.
I tried to run this job in the Query window and it return this same error.
Our data file is growing and I cannot shrink it because of this issue.
What can i do to resolve this?
Thanks,
George Gopie
There's an MMC snapin to look at shared folders and their connections and
open files. This snapin could be used to drop the connections where needed.
If you have the folder open yourself locally, this snapin won't pick it up,
so you'd have to use the task manager to see what is likely to be accessing
it.
Another possibility is to drop the folder yourself then recreate it for the
distribution agent to later delete.
Cheers,
Paul Ibison SQL Server MVP, www.replicationanswers.com .
|||No it would not cause replication to fail, only remove this portion
20060713140760
Make sure that xp_cmdshell is enabled and the SQL Serer agent account has
rights to list files and folders and delete files on the root of the
repldata directory and it subdirectories.
Hilary Cotter
Director of Text Mining and Database Strategy
RelevantNOISE.Com - Dedicated to mining blogs for business intelligence.
This posting is my own and doesn't necessarily represent RelevantNoise's
positions, strategies or opinions.
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html
Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com
"georgeg" <ggg@.hotamil.com> wrote in message
news:DFE68859-01CF-4236-AC2A-F9CA596B9482@.microsoft.com...
> The distribution cleanup job that runs for our transactional replication
> on
> sql server 2005 fails.
> The job is:
> EXEC dbo.sp_MSdistribution_cleanup @.min_distretention = 0,
> @.max_distretention = 72
> The Error is:
> Msg 20015, Level 16, State 1, Procedure sp_MSreplremoveuncdir, Line 83
> Could not remove directory
> '\\sprs76\rysrepldata\unc\SPRS76_RYS_RYS_OPERATION ALTOREPLICAHIST\20060713140760\'.
> Check the security context of xp_cmdshell and close other processes that
> may
> be accessing the directory.
> Replication-@.rowcount_only parameter must be the value 0,1, or 2. 0=7.0
> compatible checksum. 1=only check rowcou: agent
> RYSSprs76_distribution@.rowcount_only parameter must be the value 0,1, or
> 2.
> 0=7.0 compatible checksu scheduled for retry. Could not clean up the
> distribution transaction tables.
>
> If I remove this directory manually, would it cause the replication to
> fail? I have tried this on a Test server and it seems that the replication
> fails.
> I tried to run this job in the Query window and it return this same error.
> Our data file is growing and I cannot shrink it because of this issue.
> What can i do to resolve this?
> Thanks,
> --
> George Gopie
>

Distribution CleanUp creates latency

Hi,
We have one publisher, one distributor and two subscriber. We run
transactional replication.
We recently upgraded all our hardware but now we find ourselves with an
unacceptable latency. When the "Distribution Cleanup" process fires it
take up to 4 minutes to run and will cause a complete replication pause
for 1-2 minutes at times.
In a perfect world, latency would always be under 2 seconds. I'll live
with the very seldom latency of 10 seconds.
What can be done to tame the distribution cleanup jog.
Regards,
CanadianGambler
*** Sent via Developersdex http://www.codecomments.com ***
Nothing. You could run the distribution clean up job nightly and see how the
pooled commands affect performance.
Hilary Cotter
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html
Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com
"Canadian Gambler" <canadiangambler@.hotmail.com> wrote in message
news:%23xLD5HnsFHA.1252@.TK2MSFTNGP09.phx.gbl...
> Hi,
> We have one publisher, one distributor and two subscriber. We run
> transactional replication.
> We recently upgraded all our hardware but now we find ourselves with an
> unacceptable latency. When the "Distribution Cleanup" process fires it
> take up to 4 minutes to run and will cause a complete replication pause
> for 1-2 minutes at times.
> In a perfect world, latency would always be under 2 seconds. I'll live
> with the very seldom latency of 10 seconds.
> What can be done to tame the distribution cleanup jog.
> Regards,
> CanadianGambler
> *** Sent via Developersdex http://www.codecomments.com ***
|||I guess I should provide more info.
We relocated our equipment room and in the process upgraded the hardware
for all our database servers (1 publisher, 1 distributor and 2
subscribers).
We finished this move 2 weeks ago. The serious latency issue only
started about 6 days ago. We ran the same setup at the previous
location for about 15 months without latency greater than a few seconds.
So something is not quite right. Read on, help is on the way ...
After doing more research last night and this morning we find that some
of the tables that are "suppose to be cleaned up" contain more than
12,000,000 records.
So we are now approaching this problem from a different angle. 1.) Why
are those tables so big and the "clean up" job not "cleaning up". 2.)
How do we go about cleaning those up so that the "clean up job" doesn't
take so long to do it's thing. I saw some similar post so I'm hoping
that I will find some industry wisdom that can help us figure this one
out.
Regards,
Canadian Gambler
*** Sent via Developersdex http://www.codecomments.com ***

distribution cleanup cannot clean up snapshot folder

We are using SQL Server 2005 SP2 to do transactional replication.

We and have a separate service account for the SQL Agents (sqladmin) vs. SQL Replication Agents (sqlrepadmin). It is my understanding this is a replication security best practice. The sqlrepadmin has full permissions on the snapshot share folder and it's subdirectories. The sqladmin account does not have permissions at all.

I have been getting an error message when we run the distribution clean up job.

Executed as user: PROD\sqladmin. Could not remove directory '\\Tes01box\Repldata\unc\qabox01_DB01_TO_ORACLE\20070905104896\'. Check the security context of xp_cmdshell

I have dropped the publication and recreated which is what appears to have caused the error.

From

http://technet.microsoft.com/en-us/library/ms151151.aspx

Note:

If a publication is dropped, replication attempts to remove the snapshot folder under the security context of the SQL Server service account. If this account does not have sufficient privileges, log in with an account that does have sufficient privileges and remove the folder manually. Removing a folder requires the Modify privilege if the folder is a local path or the Full Control privilege if the folder is a network path.

The note above implies that the SQL Server service account (sqladmin) needs permissions on the snapshot folder as well.

Finally my questions:

Is there a workaround that will allow the distribution cleanup job to run as sqlrepadmin and perform the delete?

If both sqlrepadmin and sqladmin need permissions to the snapshot what is the reasoning from a security perspective of separating them out?

open up your distribution clean up task job, and in the job step properties do the following:

setuser 'sqlrepadmin'
GO
EXEC dbo.sp_MSdistribution_cleanup @.min_distretention = 0, @.max_distretention = 72
GO|||


I did have to qualify the domain, as in DOMAIN\sqlrepadmin to get it to work. We have same login for 2 different domains.

What an amazingly simple and elegant solution. I feel silly that I did not think of this.

Thank you Hilary.

distribution cleanup cannot clean up snapshot folder

We are using SQL Server 2005 SP2 to do transactional replication.

We and have a separate service account for the SQL Agents (sqladmin) vs. SQL Replication Agents (sqlrepadmin). It is my understanding this is a replication security best practice. The sqlrepadmin has full permissions on the snapshot share folder and it's subdirectories. The sqladmin account does not have permissions at all.

I have been getting an error message when we run the distribution clean up job.

Executed as user: PROD\sqladmin. Could not remove directory '\\Tes01box\Repldata\unc\qabox01_DB01_TO_ORACLE\20070905104896\'. Check the security context of xp_cmdshell

I have dropped the publication and recreated which is what appears to have caused the error.

From

http://technet.microsoft.com/en-us/library/ms151151.aspx

Note:

If a publication is dropped, replication attempts to remove the snapshot folder under the security context of the SQL Server service account. If this account does not have sufficient privileges, log in with an account that does have sufficient privileges and remove the folder manually. Removing a folder requires the Modify privilege if the folder is a local path or the Full Control privilege if the folder is a network path.

The note above implies that the SQL Server service account (sqladmin) needs permissions on the snapshot folder as well.

Finally my questions:

Is there a workaround that will allow the distribution cleanup job to run as sqlrepadmin and perform the delete?

If both sqlrepadmin and sqladmin need permissions to the snapshot what is the reasoning from a security perspective of separating them out?

open up your distribution clean up task job, and in the job step properties do the following:

setuser 'sqlrepadmin'
GO
EXEC dbo.sp_MSdistribution_cleanup @.min_distretention = 0, @.max_distretention = 72
GO|||


I did have to qualify the domain, as in DOMAIN\sqlrepadmin to get it to work. We have same login for 2 different domains.

What an amazingly simple and elegant solution. I feel silly that I did not think of this.

Thank you Hilary.

Distribution Cleanup Acting Weird

Hi.
In a previous post I asked if there was anything that we could do to
help the cleanup job run better. We then realized that we had over
12,000,000 records in MSreplcommands.
We have been "reducing" the max retention parameter. It was sitting at
72 hours. We started reducing this parameter by 1 hour, then 2 hours,
then 4 hours and the job would complete in about 4-5 minutes. It would
delete approximately 5000 rows/sec.
Suddenly the same job takes several hours while deleting around 80
rows/sec. This is horrible. Deleting 4 hours of old command takes 4
hours. I'll never be able to catch up.
Is there anything that we can do to bring back the 5000 rows/sec
performance of the earlier calls? What would cause the performance of
this job to decrease so drastically?
We're at a point where we are contemplating rebuilding the distributor
to start with a clean slate and set the distribution cleanup to a low
max retention figure (like 1 or 2 hours). We're hoping that with such a
low figure the job will not take so long to run and not block
replication for more than a few seconds.
One more thing. We're going to run without the distribution cleanup job
for the weekend where we have most of our activity. Will there be any
issues with replication if the distribution tables get super huge.
Regards,
Canadian Gambler.
*** Sent via Developersdex http://www.codecomments.com ***
You may be experiencing blocking betweek the cleanup agent, the logreader
agent which writes to the MSreplcommands table and the distribution agent
which reads from the same table. I'd recommend suspending the log reader and
distribution agents when you give the cleanup agent a decent time to clear
out the table. Your transaction logs may get large during this time, and
you'll need to be careful not to stray anywhere close to the distribution
retention period. If this succeeds, run the log reader and distribution
agents and suspend the cleanup agent while they clear the backlog from the
transaction log.
The above applies if the issue is indeed blocking. When dealing with several
million transactions per day I also got into the situation where the
MSreplcommands table balooned and transactions weren't removed even after
being applied to the single subscriber. Only when the retention period was
reached would they be removed. If this applies to your case, consider
synchronizing then really reduce the retention period.
HTH,
Paul Ibison SQL Server MVP, www.replicationanswers.com
(recommended sql server 2000 replication book:
http://www.nwsu.com/0974973602p.html)
|||Thanks for the help.
this morning we pointed all the ODBCs to the main database, stop
replication and ran the CleanUp job. It took as long to complete.
We will have to synchronize next week, blow up the distribution database
and start fresh. Any issues setting the max retention as low as 1 or 2
hours.
Regards,
Canadian Gambler.
*** Sent via Developersdex http://www.codecomments.com ***
|||A low value for the distribution retention period is a bit risky. You're
assuming that all the transactions will get delivered tothe subscriber
during this period, and if there is a network problem, you'll have to
remember to immediately stop the cleanup agent. I've only ever set a low
value after synchronization to remove the redundant records from
MSreplcommands and then reset afterwards.
Cheers,
Paul Ibison SQL Server MVP, www.replicationanswers.com
(recommended sql server 2000 replication book:
http://www.nwsu.com/0974973602p.html)

Distribution cleanup

Are transactions that have been distributed to all subscribers always stored the maximum retention period in the distribution database?

I'm using peer-to-peer replication.

No, they are stored the minimum retention period and are cleanup by the distribution clean up task. If a subscriber is offline, the transactions will be stored for the maximum retention period unless the subscriber is expired by this history retenion level.|||

But if the minimum retention period is set to 0 shouldn't all delivered transactions be cleaned up in that case.

In my peer-to-peer replication I have a publication with four subscribers and everything works fine all transactions are replicated and I don't have any latency problems.

So why is the MSrepl_commands still growing.

I can see in the output for distribution cleanup job that transactions actually are cleaned up but the MSrepl_commands grows. From time to time a the distribution cleanup job gets selected as a deadlock victim.

The size of the distribution db is now about 50GB with 100'000'000 rows in the MSrepl_commands table.

/Peter

Friday, February 24, 2012

Distribution cleanup

I would love to be able to run the distribution cleanup job with a switch that says cleanup all distributed transactions.

Because when I use peer to peer replication the @.allow_initialize_from_backup publication property is set to true which is good. But it has the down side that transactions are stored the max retention period in the distribution database. I want to use the deafault 72 hours for my retention period so that the subscritions don't get deactivated but in a system with a high transaction rate there wil be a lot of transactions in 72 hours. This means that the cleanup job will have a tough time to figure out which transactions to delete so the cleanup job will run for a long time not a very big problem but the problem is that the cleanup job will keep the log reader agent from delivering trtansactions to the distribution database and the subscribers won't get their data in time.

Could Microsoft please give me a switch so I can choose when I want to save my transaction and when I want to delete them as soon as they have been delivered to all subscribers?

Is this a feature in SQL Server 2008? Could it be released in SP3 for SQL Server 2005. (The SP 2 cleanup job has a bug so I have to use the SP 1 verison of the cleanup job)

just out of curiosity what is the bug in SP2 cleanup job? Do you happen to have a bug number or description, i can look it up and see if there's a workaround for your problem.|||

I don't have the bug number My tech lead at Microsoft will come back to me when the dev team has made the correction. The workaround I use is to run the SP1 versions of some cleanup procs.

This workaround atleast lets me delete the records older than max retention.

But I still would lit to be able to give two dates to the cleanup job. One for checking the subscription deactivation and one for transaction cleanup.