The beginning of a new year also marks the launch of this new SQL Server (and related ecosystem, of course!) blog, with a focus on storage and security.
Hopefully you will find these posts useful, and/or educational.
Unless you have been living under a rock, you’ve heard about the concept of contained users within a SQL Server database (hit the link if you have not). In this article I’d like to show you that you can, indeed, script out contained database users and recreate them on another database, as either contained users or as good old fashioned logins/server principals as well.
Why would this be useful? Well, because you would not need to know the password for the user in order to recreate it on another instance. I know there is a limited number of scenarios where this would be necessary, but nonetheless I figured I’d throw this blog post together to show how it can be done.
A more obscure use case: with the password hash (which I’m about to show you how to obtain) you could also crack the password using a utility like hashcat, as highlighted on this SQLServerCentral article.
SQL Server uses System Base Tables to save the password hashes of logins and contained database users. For logins it uses sys.sysxlgns, whereas for contained database users it leverages sys.sysowners.
I’ll show you what I do to figure this stuff out: I create a login/contained user, and then I immediately browse the transaction log with, for example, fn_dblog.
It’s pretty obvious that the only two base tables touched by the operation are sys.sysxlgns and sys.sysprivs – the latter is used to track permissions. If I connect to the DAC on my instance, I can query for the password hash of this login I’ve just created.
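If you want to try this yourself, a minimal sketch looks like this – the login name is made up, and keep in mind fn_dblog is undocumented, so its output may vary between versions:

[sql]CREATE LOGIN TransactionLogDemo WITH PASSWORD = 'Str0ngP@ssw0rd!';
GO
-- Which allocation units (base tables) did the operation touch?
SELECT [Current LSN], Operation, AllocUnitName
FROM fn_dblog(NULL, NULL)
WHERE AllocUnitName LIKE 'sys.sysxlgns%'
   OR AllocUnitName LIKE 'sys.sysprivs%';[/sql]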
A few interesting things about this hash. This was taken on my laptop, and I happen to be running SQL Server 2014 RTM CU2, which is the latest public build of SQL Server 2014 as of this writing. In 2008 R2 and prior versions (back to 2000), the password hashes would start with 0x0100. The reason for the change is that starting with SQL Server 2012 password hashes are kept using a SHA-512 algorithm, as opposed to SHA-1 (used since 2000) or Snefru (used in 6.5 and 7.0). SHA-1 is nowadays deemed unsafe and is very easy to crack.
For regular SQL logins, this information is exposed through the sys.sql_logins catalog view, so there is really no need to connect to the DAC to grab an SID/password hash pair. For contained database users, there is (currently) no method of obtaining SID or password hashes without connecting to the DAC.
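To illustrate the difference, something along these lines works – connect through the DAC (prefix your connection with ADMIN:) for the base table query; the login name here is just an example:

[sql]-- Requires the DAC; sys.sysxlgns is a System Base Table
SELECT name, sid, pwdhash
FROM master.sys.sysxlgns
WHERE name = 'MyTestLogin';

-- No DAC needed for regular SQL logins:
SELECT name, sid, password_hash
FROM sys.sql_logins
WHERE name = 'MyTestLogin';[/sql]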
If we create a contained database user, this is what we get from the transaction log:
Note that the System Base Table used in this case is sys.sysowners. sys.sysprivs is used as well, and again this is to track permissions.
To query sys.sysowners, you would have to connect to the DAC, as I mentioned previously. And this is what you would get:
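A sketch of that query – again over the DAC, run in the context of the contained database, and assuming a user I’ll call ContainedDemoUser (I’m using SELECT * because the exact column list of this base table can vary by build):

[sql]-- Requires the DAC; run inside the contained database
SELECT *
FROM sys.sysowners
WHERE name = 'ContainedDemoUser';[/sql]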
There are other ways to figure out what SQL Server uses under the hood to store contained database user password hashes, like looking at the execution plan for a query to sys.dm_db_uncontained_entities (Thanks, Robert Davis!).
One of the reasons behind the existence of Contained Users was the concept of portability of databases: it is really painful to maintain Server Principals (Logins) synced across most shared-nothing SQL Server HA/DR technologies (Mirroring, Availability Groups, and Log Shipping). Often times you would need the Security Identifier (SID) of these logins to match across instances, and that meant that you had to fetch whatever SID was assigned to the login on the principal instance so you could recreate it on a secondary. With contained users you normally wouldn’t care about SIDs, as the users are always available (and synced, as long as synchronization takes place) across instances.
Now you might be presented with a particular requirement that SIDs be kept in sync between logins on certain instances and contained database users on other databases. How would you go about creating a contained database user with a specific SID?
The answer is that you can’t do it directly, but there’s a little trick that would allow you to do it.
Create a login with a specified SID and password hash, create a user for that server principal on a partially contained database, migrate that user to contained using the system stored procedure sp_migrate_user_to_contained, and then drop the login.
[sql]CREATE LOGIN <login_name> WITH PASSWORD = <password_hash> HASHED, SID = <sid>;
CREATE USER <user_name> FROM LOGIN <login_name>;
EXEC sp_migrate_user_to_contained @username = <user_name>, @rename = N'keep_name', @disablelogin = N'disable_login';
DROP LOGIN <login_name>;[/sql]
Here’s how this skeleton looks in action:
And now I have a contained user with a specified SID and password hash. In my example above, I renamed the user after migrating it to contained so that it is, hopefully, easier to understand.
Even though way early in my career I had a SQL Server MVP as a mentor (by pure chance), it wasn’t until 2008-2009 that I sort of researched a bit more about the program, and what it really meant. I was beginning to understand how powerful networking in our community really is, and how attending and participating on SQL Saturdays and other events would make all the difference in my career.
I thought having that “badge” would be cool, but I really didn’t know how to get one.
I remember I asked K. Brian Kelley [twitter|blog], somebody who I consider a friend (even though I’ve never met him in person), how I could become a SQL Server MVP. He mentioned that helping others through any vehicle (forums like MSDN’s, SQL Server Central, etc.) would help, but that I really needed to ask myself what it would mean to me to be an MVP. And I often wondered, really, what it all meant. Was it just an award given to somebody who would answer questions from others relentlessly? Would it be just some award given to members of this semi-secret inner circle?
Over time I became more and more involved with the community. I started speaking, and continued helping others, mostly through #SQLHelp on Twitter (funny how a hashtag changes your life!). I was seriously turned off by forums (still am) and didn’t care much about blogging very frequently. After all, other folks out there blog about just about every single topic on SQL Server, and I didn’t want to be just one more guy. I still blogged, but only when I thought my post could really make a difference to somebody facing a similar situation or working with an under-documented feature.
The fact that I tweeted incessantly helped get my name out there, and I kept being accepted to speak at SQL Saturdays and even the PASS Summit. Then I got my MCM certification, which really gave my name some exposure out there.
I was really becoming influential.
I eventually delivered pre-conferences at premier events like SQLBits, spoke at massive events like DevConnections and TechEd, and started toying with the idea of helping PASS even more. Then, with the help of Robert Davis [twitter|blog], and also from K. Brian Kelley himself, we launched the Security Virtual Chapter for PASS.
I was doing all this high profile stuff, which was fantastic, but I never lost sight of my roots: volunteering and helping others. I kept signing up to help PASS as a member of the Program Committee, something that I have proudly done for 3 years in a row (and I’m not going to discuss the current controversy on the subject here), helped moderate Lightning Talks at the Summit even when I wasn’t chosen as a speaker, spent hours talking to people at the Community Zone at the Summit and the BA Conference, moderated online web casts for the Security VC, and helped moderate online events like 24 Hours of PASS. And, all along, kept learning and sharing on #SQLHelp.
All the little things that not a lot of people sign up for (or avoid altogether), I happily help with.
Yet “MVPness” is not something that I feel entitled to. Not something that I demanded every quarter. I find it funny that I look forward to the MVP Summit taking place in Seattle because that means I get to hang out with that group – I have met a LOT of them along the way, and some of them I consider extremely close friends. These are people that I deeply respect. Still, I am not expecting an award by association.
I do the things I do for the SQL Server community because I love it, not because I want another three letter acronym on my resume.
Earlier this year I was nominated for the award by several very influential folks in the community (Thank you!!). I seriously thought I wasn’t going to get it, because I thought I hadn’t done enough for the community to deserve it.
Today, I am _insanely_ honored to be recognized for my contributions to the community.
This is one badge I will wear with pride.
Today I was awarded the SQL Server MVP award.
And now the real work begins.
This award is yours, #SQLFamily, and also goes to the unsung heroes of this community that have been working hard for YEARS and still have not been awarded; you know who you are. Never give up, my friends. Stay true to your north.
I remember when back in 2001 my friend and former SQL Server MVP Carlos Eduardo Rojas was busy earning his MVP street-cred in the NNTP forums, aka Newsgroups. I always thought he was playing the Sheriff trying to put some order in a Wild Wild West town by trying to understand what these people were asking. He spent a lot of time doing this stuff – and I thought it was just plain crazy. After all, he was doing it for free. What was he gaining from all of that work?
It was not until the advent of Twitter and #SQLHelp that I realized the real gain behind helping others. Forget about the glory and the laurels of others thanking you (and thinking you’re the best thing ever – ha!), or whatever award with whatever three letter acronym might be given to you.
It’s about what you learn in the process of helping others.
See, when you teach something, it’s usually at a fixed date and time, and on a specific topic. But helping others with their issues or general questions is something that goes on 24×7, on whatever topic under the sun. Just go look at sites like DBA.StackExchange.com, or the SQLServerCentral forums. It’s questions coming in literally non-stop from all corners of the world. And yet a lot of people are willing to help you, regardless of who you are, where you come from, or what time of day it is.
And in my case, this process of helping others usually leads to me learning something new. Especially in those cases where the question isn’t really something I’m good at. The delicate part comes when you’re ready to give an answer, but you’re not sure. Often times I’ll try to validate with Internet searches and what have you. Often times I’ll throw in a question mark at the end of the answer, so as not to look authoritative, but rather suggestive. But as time passes by, you get more and more comfortable with that topic. And that’s the real gain.
I have done this for many years now on #SQLHelp, which is my preferred vehicle for providing assistance. I cannot tell you how much I’ve learned from it. By helping others, by watching others help. It’s all knowledge and experience you gain…and you might not be getting all that in your day job today. That, my dear reader, is invaluable. It’s what will differentiate yours among a pack of resumes. It’s what will get you places. Take it from me – a guy who, like you, knew nothing about SQL Server.
There is a classic blog series by Paul Randal [blog|twitter] called “CHECKDB From Every Angle” which is pretty much mandatory reading for anybody who’s even remotely considering going for the MCM certification, or its replacement (the Microsoft Certified Solutions Master: Data Platform – makes my fingers hurt just from typing it). Of particular interest is the post “Consistency Options for a VLDB” – on it, Paul provides solid, timeless advice (I use the word “timeless” because it was written in 2007, and it all applies today!) on how to perform checks on very large databases.
Well, here I was trying to figure out how to make CHECKDB run faster on a restored copy of one of our databases, which happens to exceed 7TB in size. The whole thing was taking several days on multiple systems, regardless of the storage used – SAS, SATA or even SSD…and I actually didn’t pay much attention to how long it was taking, or even bothered to look at the reasons why – as long as it was finishing okay and found no consistency errors.
Yes – I know. That was a huge mistake, as corruption found in a database several days after taking place could only allow for further spread of the corruption – and potentially large data loss.
In the last two weeks I turned my attention to this problem, as we noticed that CHECKDB was taking EVEN LONGER on brand new all-flash storage in the SAN! I couldn’t really explain it, and was almost ready to blame the storage vendor. The vendor told us that they could initially see the server driving decent I/O – around 450MB/sec – and then it would settle at a very slow rate of 10MB/sec or so. “Hmm,” I thought – “CHECKDB is just not pushing the I/O subsystem hard enough.” Perfmon confirmed the vendor’s observations.
What was CHECKDB doing all the time while doing so little I/O? Eating Blobs.
It turns out that CHECKDB was taking an extremely long time on one of our frankentables, which happens to have 35 billion rows (yup, with a b) and sucks up several terabytes of space in the database. We do have a project ongoing to purge/split/partition this table, so it’s just a matter of time before we deal with it.
But the reality today is that CHECKDB is coming to a screeching halt in performance when dealing with this particular table.
Checking sys.dm_os_waiting_tasks and sys.dm_os_latch_stats showed that LATCH_EX (DBCC_OBJECT_METADATA) was by far the top wait type. I remembered hearing recently about that wait from another post that Paul Randal made, but that was related to computed-column indexes, and in fact, Paul himself reminded me of his article via twitter. But alas, our pathologic table had no non-clustered indexes on computed columns.
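For reference, this is the kind of check I mean – a quick snapshot of latch waits while CHECKDB is running (a sketch; clearing the stats first gives you a clean sample, but note it resets them instance-wide):

[sql]-- Optional: clear the cumulative latch stats for the whole instance
DBCC SQLPERF('sys.dm_os_latch_stats', CLEAR);
GO
-- ...let CHECKDB run for a while, then:
SELECT latch_class, waiting_requests_count, wait_time_ms
FROM sys.dm_os_latch_stats
ORDER BY wait_time_ms DESC;[/sql]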
I knew that latches are used by the database engine to do internal synchronization – but how could I help speed this up? After all, this is stuff that doesn’t have a lot of knobs to tweak.
(There’s a fantastic level 500 talk by Bob Ward from Microsoft CSS [blog|twitter] called “Inside SQL Server Latches” given at PASS 2010 – and you can check it out here. DISCLAIMER: I assume no responsibility for any brain melting that might ensue from watching Bob’s talk!)
Earlier this week I flew down to Palo Alto, CA, to visit our Headquarters – and after having a great time with my Monkey peers, I was relaxing on the plane back to Seattle watching a great talk by SQL Server MVP and fellow MCM Maciej Pilecki [twitter] called “Masterclass: A Day in the Life of a Database Transaction” where he discusses many different topics related to transaction management inside SQL Server. Very good stuff, and when I got home it was a little late – that slow DBCC CHECKDB that I had been dealing with was way in the back of my head.
As I was looking at the problem at hand earlier this week, I thought “How about I set the database to read-only?” I remembered one of the things Maciej had (jokingly) said in his talk: “if you don’t want locking and blocking, set the database to read-only” (or something to that effect, pardon my loose memory). I immediately killed the CHECKDB which had been running painfully for days, and set the database to read-only mode. Then I ran DBCC CHECKDB against it. It started going really fast (even a bit faster than before), and then throttled down again to around 10MB/sec. All sorts of expletives went through my head at the time. Sure enough, the same latching scenario was present. Oh well.
I even spent some time trying to figure out if NUMA was hurting performance. Folks on Twitter made suggestions in this regard (thanks, Lonny! [twitter])
This past Friday I was still scratching my head about the whole thing; I was ready to start profiling with XPERF to see if I could figure out which part of the engine was to blame and then get Microsoft to look at the evidence.
After getting a bunch of good news I’ll blog about separately, I sat down for a figurative smack down with CHECKDB before the weekend. And then the light bulb went on.
A sparse column. I thought that I couldn’t possibly be experiencing the same scenario that Paul blogged about back in March showing extreme latching with non-clustered indexes on computed columns. Did I even have a non-clustered index on my sparse column?
As it turns out, I did. I had one filtered non-clustered index – with the sparse column as the index key (and only column).
To prove that this was the problem, I went and set up a test.
The repro is very simple for this issue: I tested it on the latest public builds of SQL Server 2008 R2 SP2 (CU6) and SQL Server 2012 SP1 (CU4).
First, create a test database and a test table, which only needs to contain a sparse column:
[sql]CREATE DATABASE SparseColTest;
GO
USE SparseColTest;
GO
CREATE TABLE testTable (testCol smalldatetime SPARSE NULL);
GO
INSERT INTO testTable (testCol) VALUES (NULL);
GO 1000000[/sql]
That’s 1 million rows, and even though you’re inserting NULLs, that’s going to take a while. On my laptop, it took 3 minutes and 31 seconds.
Next, we run DBCC CHECKDB against the database:
[sql]DBCC CHECKDB('SparseColTest') WITH NO_INFOMSGS, ALL_ERRORMSGS;[/sql]
This runs extremely fast, at least on my test rig – 198 milliseconds.
Now let’s create a filtered non-clustered index on the sparse column:
[sql]CREATE NONCLUSTERED INDEX [badBadIndex] ON testTable (testCol) WHERE testCol IS NOT NULL; [/sql]
With the index in place now, let’s run DBCC CHECKDB one more time:
[sql]DBCC CHECKDB('SparseColTest') WITH NO_INFOMSGS, ALL_ERRORMSGS;[/sql]
In my test system this statement completed in 11433 milliseconds. 11.43 full seconds. Quite the jump from 198 milliseconds.
I went ahead and dropped the filtered non-clustered indexes on the restored copy of our production database, and ran CHECKDB against that. We went down from 7+ days to 19 hours and 20 minutes.
My pain is your gain, folks. Go check to see if you have any of such indexes – they’re likely causing your consistency checks to run very, very slow.
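A query along these lines should surface candidates – it’s a sketch, but sys.indexes.has_filter and sys.columns.is_sparse are the key flags to look for:

[sql]SELECT OBJECT_NAME(i.[object_id]) AS tableName,
       i.name AS indexName,
       c.name AS sparseColumn
FROM sys.indexes i
JOIN sys.index_columns ic
  ON ic.[object_id] = i.[object_id] AND ic.index_id = i.index_id
JOIN sys.columns c
  ON c.[object_id] = ic.[object_id] AND c.column_id = ic.column_id
WHERE i.has_filter = 1
  AND c.is_sparse = 1;[/sql]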
ps: I plan to file a Connect item for this issue – I consider it a pretty serious bug in the engine. After all, filtered indexes were invented BECAUSE of the sparse column feature – and it makes a lot of sense to use them together. Watch this space and my twitter timeline for a link.
Retaking on this blog series after an extended hiatus. I hope you will enjoy today’s topic on documentation.
One of the often neglected and yet extremely important tasks of any DBA is to document his/her environment. I can certainly relate to those of you who haven’t written a single page of documentation for your environment – I was there before. And I felt bad.
The good news is that documentation doesn’t have to be painful. My friends at Brent Ozar Unlimited recently posted a video on documentation – you can find it in this link – and they touch on that very same subject: There ARE ways to document your environment that will leave you greatly satisfied with the results, and some of you might even enjoy the process. Your manager will love you for it – and you won’t feel bad for the new guy who replaces you when it’s time to move on.
Let’s face it: the traditional approach is not perfect. Writing pages and pages of documentation for your database environment (or ANY IT environment) in Word is tiresome. Excel sheets can hold tons of information and might be easy to peruse. But no matter how good your formatting is, how many pictures and Visio diagrams you throw at it, you’ll always get the sense that the document you spent days working on will just end up in the SharePoint site and collect dust there. SharePoint can handle changes and versioning for you, if you use it correctly – so that’s a plus. But think about ease of access to the documents – ease of editing. Discussions around it. Even automated updates! Is a SharePoint folder the way to go for that? (Or a UNC path, for those of you without SharePoint?)
You’ve been to WikiPedia. More than once. You know how useful it is, and how, thanks to the built-in search function and search engines, easy it is to find information on it.
How about using a wiki engine for your documentation? Can’t picture that in your head? Hear me out: wikis are VERY easy to create, edit and collaborate on – you just click on Edit and away you go. They have built-in version control, discussion pages, categories, lists, and pretty much everything you need to have a versatile documentation site.
Picture this: a wiki page for every environment, listing all the servers that belong to it – and linking to individual wiki pages with tons of information about those servers, including former incident highlights and detailed information about the server (CPUs, RAM, OS version, SQL instances, patches, team(s) who own the server, escalation paths for incidents, and much more).
The best part: these pages that have lots of information can be updated via an automated process. WHAT? AUTOMATED DOCUMENTATION YOU SAY? Yes. Think scripts. Scripts that will go over your environments on a regular basis and update information on them in the wiki. Does that sound cool or what? All you need is to understand the capabilities of your wiki engine (think API), and pair it up with your favorite scripting language. For me, it’s a slam-dunk: I’ll use PowerShell to automate the information gathering process, and a wiki engine that runs on SQL Server – I’ll show you a few options in a bit.
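As an example of the kind of information such a script might gather on each instance (a sketch – the PowerShell side would simply run this and push the results to the wiki engine’s API):

[sql]SELECT SERVERPROPERTY('MachineName')   AS serverName,
       SERVERPROPERTY('Edition')        AS edition,
       SERVERPROPERTY('ProductVersion') AS productVersion,
       SERVERPROPERTY('ProductLevel')   AS patchLevel,
       SERVERPROPERTY('IsClustered')    AS isClustered;[/sql]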
Now imagine that you give access to your NOC/Helpdesk to this wiki site: you don’t want them to change anything on it, so you set permissions for read-only access to the site. The benefits would be great: now the NOC has easy access to information on a given server that just alerted, and if you saved notes for that type of alert for that server, instructing Operators to collect certain information before they call you, that can save precious time and help you troubleshoot the issue. You might even let the NOC track incidents on a particular section of the wiki page for that server, so you can figure out if there is a pattern that you need to address.
YMMV on your organization’s needs – but this can be a very valuable tool to keep your business running efficiently. I have worked for teams that relied heavily on Wikis for documentation – and saw first hand how this greatly improves access to information for our team. How you use it in your environment and how you adapt it to your requirements is entirely up to you – I only wanted to give you a few ideas on how you might put this to your advantage.
There are TONS of wiki engines out there: WikiPedia has a page (not frequently updated, alas) that lists some of them. Being the SQL Server guy that I am, I’m naturally inclined to use something powered by SQL in the backend. The most widely used wiki engine, MediaWiki (developed by the Wikimedia Foundation), is a typical LAMP (Linux/Apache/MySQL/PHP) application. If you have experience with Apache, PHP and MySQL, you could use that. It’s updated very frequently, and it’s got a large community behind it.
Now, if you have SharePoint, you may or may not know that it comes with a wiki engine as well – and that’s detailed here.
Recently I used FlexWiki, which uses SQL Server as a backend. Since the backend is a SQL database, you back it up and keep it safe along with the rest of your backups – just don’t host the wiki engine’s database on the same servers you’re trying to document. If you lose one of those servers, you lose access to the documentation about it.
Unfortunately, FlexWiki isn’t being actively maintained. But look at the list of Wiki software – you’re bound to find something that fits your needs.
If you have no documentation (or poor documentation), go take care of it now. Go create and/or reuse scripts that gather information about your environment. Be proactive about it – make sure those scripts get updated. Show your work to your peers and your boss. Feel confident that you have control of your environment, and rock on!
I’d love to hear what you’re doing in your environment as far as documentation goes. Will you leave some comments below? Also, if you feel like giving some feedback on the DBA Best Practices series, I’d appreciate it.
Until next time,
If you were looking for Paul Randal’s [Blog|Twitter] DR poster and couldn’t find it anywhere on the net, I have good news for you: I saved a copy on my PC before it was taken down, and (with Paul’s permission) wanted to make it available here.
So, without further ado, enjoy the poster: DR Poster-V2.
(I’m taking a break from my DBA Best Practices blog series to talk about this today – the series continues after the Holidays!)
I love to be proved wrong. Really, I do.
Recently we had a discussion in a distribution list where somebody asked whether a SQL Server 32-bit instance could address more than 4Gb of RAM when running on top of a 64-bit OS. One of the really smart guys in Microsoft SQL Server Support, Karthick P.K. [Blog|Blog|Twitter], replied “sure, just add more RAM and enable AWE – SQL will use that memory”. I was quite convinced that this was incorrect, so I jumped in and said that AWE does nothing under those circumstances. After all, that is what I had read in the past in many different articles. I even had some SQL Server MVPs who have been around the block forever agreeing with me.
But the good news is that I was wrong.
AWE does indeed do nothing on 64-bit instances of SQL Server. But on WOW64 – Windows on Windows 64-bit, the subsystem of Windows x64 that allows you to run 32-bit processes, including 32-bit instances of SQL Server – AWE does allow you to address memory above 4Gb. Let me show you.
Side note: AWE is now deprecated – removed in SQL Server 2012. The last version of SQL Server that supports AWE is SQL Server 2008 R2. Because of this, a SQL Server 2012 x86 instance won’t be able to enjoy over 4Gb of memory – even if running on WOW64.
I set up a vanilla VM in my lab with a 64-bit OS and 8Gb of memory. MSINFO32 looks like this on the VM:
I installed a 32-bit SQL Server 2008 R2 SP2 instance on the VM. @@VERSION looked like this:
I did not enable AWE. I did grant the “Perform Volume Maintenance Tasks” and “Lock Pages in Memory” privileges using secpol.msc (not pictured!). I wanted to see what memory consumption would look like without AWE enabled.
This screenshot shows the privileges granted to the SQL Server service account (NETWORK SERVICE, in this case) – I used “whoami /priv”, a quick and dirty way to get the privileges of the service account via xp_cmdshell. Ugh, xp_cmdshell. It’s okay, it’s just a test.
I went ahead and created a test database with just one big table – big enough to use all the buffer pool memory on the instance when a size-of-data operation is executed on it. I did set Max Server Memory to 6Gb.
Here’s the CREATE table script:
[sql]CREATE TABLE [dbo].[testTable](
    [id] [int] IDENTITY(1,1) NOT NULL,
    [wideColumn] [nvarchar](4000) NULL,
    PRIMARY KEY CLUSTERED ([id] ASC)
    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF,
          ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[testTable] ADD DEFAULT (replicate(N'A',(4000))) FOR [wideColumn]
GO[/sql]
Next, I went ahead and inserted 1M rows on the table:
[sql]INSERT INTO [dbo].[testTable] DEFAULT VALUES
GO 1000000[/sql]
If you do the math, you will agree with me that this not-so-little table will be more than 6Gb in size: each row carries a 4,000-character nvarchar value (8,000 bytes), so roughly one row fits per 8KB data page – close to 8Gb for one million rows. Let’s check anyway:
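A quick way to verify the size (sp_spaceused works fine for a single table):

[sql]EXEC sp_spaceused N'dbo.testTable';[/sql]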
Cool! The table is large enough for the purpose of this test. But what does memory utilization look like after that 1M row insert?
That’s less than 4Gb. Next I turned on AWE, and restarted the instance. Then I queried the ERRORLOG to confirm that AWE was enabled:
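The errorlog check can be as simple as this (xp_readerrorlog is undocumented; here the third parameter acts as a search string):

[sql]-- 0 = current log, 1 = SQL Server error log
EXEC sys.xp_readerrorlog 0, 1, N'Address Windowing Extensions';[/sql]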
Okay, since I had just restarted the instance my buffer pool was gone – needed to force a size of data operation on the large table. Rebuilding the clustered index did just fine (I know, that’s probably overkill). Then I checked sys.dm_os_process_memory again:
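The memory check itself – sys.dm_os_process_memory reports AWE/locked allocations in locked_page_allocations_kb, alongside regular working set usage:

[sql]SELECT physical_memory_in_use_kb / 1024 AS physicalMemoryInUseMB,
       locked_page_allocations_kb / 1024 AS lockedPageAllocationsMB
FROM sys.dm_os_process_memory;[/sql]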
And there we go! A lot more than 4Gb being used now that AWE is in play. I had set Max Server Memory to 6Gb, as I detailed earlier.
It looks like there’s quite a lot of incorrect documentation regarding this, so I wanted to clear up any confusion – and learn something new along the way. Don’t you love SQL Server?
Happy Holidays and Happy New Year!
P.S.: If you are running IA64 (Itanium) – then this doesn’t apply to you. AWE does not work at all on IA64.
One of the biggest digital assets that any company has is its secrets. These include passwords, key rings, certificates, and any other digital asset used to protect another asset from tampering or unauthorized access.
As a DBA, you are very likely to manage some of these assets for your company – and your employer trusts you with keeping them safe. Probably the most important of these assets are passwords. As you well know, they can be used everywhere: for service accounts, credentials, proxies, linked servers, DTS/SSIS packages, symmetric keys, private keys, etc., etc.
Have you given some thought to what you’re doing to keep these passwords safe? Are you backing them up somewhere? Who else besides you can access them?
If you have a password-protected Excel sheet for your passwords, I have bad news for you: Excel’s level of encryption is good for your grandma’s budget spreadsheet, not for a list of enterprise passwords.
I will try to summarize the main point of this best practice in one sentence: You should keep your passwords on an encrypted, access and version-controlled, backed-up, well-known shared location that every DBA on your team is aware of, and maintain copies of this password "database" on your DBA’s workstations.
Now I have to break down that statement to you:
– Encrypted: what’s the point of saving your passwords on a file that any Windows admin with enough privileges can read?
– Access controlled: This one is pretty much self-explanatory.
– Version controlled: Passwords change (and I’m really hoping you do change them) and version control would allow you to track what a previous password was if the utility you’ve chosen doesn’t handle that for you.
– Backed-up: You want a safe copy of the password list to be kept offline, preferably in long term storage, with relative ease of restoring.
– Well-known shared location: This is critical for teams: what good is a password list if only one person in the team knows where it is?
I have seen multiple examples of this that work well. They all start with an encrypted database. Certainly you could leverage SQL Server’s native encryption solutions like cell encryption for this. I have found such implementations to be impractical, for the most part.
There are a myriad of open source/free software solutions to help you here. One of my favorites is KeePass, which creates encrypted files that can be saved to a network share, Sharepoint, etc. KeePass has UIs for most operating systems, including Windows, MacOS, iOS, Android and Windows Phone.
There are, of course, even more "enterprise-level" solutions available from 3rd party vendors. The truth is that most of the customers that I work with don’t need that level of protection of their digital assets, and something like a KeePass database on Sharepoint suits them very well.
What are you doing to safeguard your passwords? Leave a comment below, and join the discussion!
This blog post is part of the DBA Best Practices series, on which various topics of concern for daily database operations are discussed. Your feedback and comments are very much welcome, so please drop by the comments section and be sure to leave your thoughts on the subject.
When I was a DBA, the first thing I did when I sat down at my desk at work was to check that all backups had completed successfully. It really was more of a ritual, since I had a dual system in place to check for backup completion: 1) the scheduled agent jobs that backed up the databases were set to alert the NOC on failure, and 2) I had a script run from a central server every so often to check for any backup failures.
Why the redundancy, you might ask. Well, for one I was once bitten by the fact that Database Mail doesn’t work 100% of the time. Potential causes for failure include issues on the SMTP box that relays your server email, firewall problems, DNS issues, etc. So to be sure that my backups had completed fine, I needed to rely on a mechanism other than having the servers do the talking – I needed to interrogate the servers and ask each one whether an issue had occurred. This is why I had a script run every so often.
Some of you might have monitoring tools in place, like Microsoft System Center Operations Manager (SCOM) or similar 3rd party products, that track all of these things for you. But at the time, we had no choice but to write our own PowerShell scripts to do it.
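As a rough sketch of what the T-SQL side of such a check can look like – the 24-hour threshold and the tempdb filter are illustrative assumptions, not the exact script I used – you can interrogate msdb's backup history on each instance:

```sql
-- Sketch: flag user databases with no full backup in the last 24 hours.
-- 'D' in msdb.dbo.backupset.type denotes a full database backup.
SELECT d.name AS database_name,
       MAX(b.backup_finish_date) AS last_full_backup
FROM   sys.databases AS d
LEFT JOIN msdb.dbo.backupset AS b
       ON b.database_name = d.name
      AND b.type = 'D'
WHERE  d.name <> 'tempdb'              -- tempdb is never backed up
GROUP BY d.name
HAVING MAX(b.backup_finish_date) < DATEADD(HOUR, -24, GETDATE())
    OR MAX(b.backup_finish_date) IS NULL;
```

A central PowerShell script can then loop over your instance list, run this query against each one, and raise an alert on any rows returned.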
Now it goes without saying that if you don’t have backups in place, you might as well find another career. Your most sacred job as a DBA is to protect the data from a disaster, and only properly safeguarded backups can offer you peace of mind here.
Sadly, I’ve heard this line more often than I would have liked. You need to understand that a cluster is built on shared storage, and that storage is precisely your single point of failure. A cluster will protect you from an issue at the operating system level, and from an outage of any SQL-related service or dependent device. But it will most definitely NOT protect you against corruption, nor will it protect you against somebody deleting data from a table – accidentally or otherwise.
The answer to this is something you will hear frequently when working with databases: it depends.
What does it depend on? For one, you need to understand how much data your business is willing to lose. This is what’s called Recovery Point Objective, or RPO. If you don’t know how much data your business is willing to lose, you need to have an honest and realistic conversation about data loss expectations with your customers, internal or external. From my experience, their first answer to the question “how much data loss can you withstand?” will be “zero”. In that case, you will need to explain how zero data loss is very difficult and very costly to achieve, even in today’s computing environments.
Do you want to go ahead and take full backups of all your databases every hour, or even every day? Probably not, because of the impact that taking a full backup can have on a system. That’s what differential and transaction log backups are for.
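To make that distinction concrete, here is a minimal sketch of such a cadence – the database name, paths, and schedule comments are illustrative assumptions, not a prescription:

```sql
-- Full backup: the baseline every other backup depends on (e.g., weekly).
BACKUP DATABASE SalesDB
TO DISK = N'X:\Backups\SalesDB_FULL.bak'
WITH CHECKSUM, COMPRESSION;

-- Differential: everything changed since the last full (e.g., nightly).
BACKUP DATABASE SalesDB
TO DISK = N'X:\Backups\SalesDB_DIFF.bak'
WITH DIFFERENTIAL, CHECKSUM;

-- Transaction log: captures activity since the last log backup (e.g., every few minutes).
BACKUP LOG SalesDB
TO DISK = N'X:\Backups\SalesDB_LOG.trn'
WITH CHECKSUM;
```

WITH CHECKSUM is worth the small overhead here: it validates page checksums as the backup is written, catching some corruption at backup time instead of at restore time.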
Have I answered the question of how often to take a backup? No, and I did that on purpose. You need to think about how much time you have to recover from any event that requires you to restore your databases. This is what’s called Recovery Time Objective, or RTO. Again, if you go ask your customer how long of an outage they can withstand, at first you will get a completely unrealistic number – and that will be your starting point for discussing a solution that is cost effective.
The point that I’m trying to get across is that you need to have a plan. This plan needs to be practiced, and tested. Like a football playbook, you need to rehearse the moves you’ll perform when the time comes. How often is up to you, and the objective is that you feel better about yourself and the steps you need to follow when emergency strikes.
Backups are files. Files are prone to corruption. Put those two together and realize how you feel about those backups sitting on that network drive. When was the last time you restored any of those?
Restoring your backups on another box – that, by the way, doesn’t have to match the specs of your production server – will give you two things: 1) peace of mind, because now you know that your backups are good and 2) a place to offload your consistency checks with DBCC CHECKDB or any of the other DBCC commands like CHECKTABLE or CHECKCATALOG. This is a great strategy for VLDBs that cannot withstand the additional load created by the consistency checks.
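A minimal sketch of that offloading flow, assuming illustrative file names and target paths (the MOVE clauses depend entirely on the secondary server's drive layout):

```sql
-- Restore the latest full backup on the secondary box...
RESTORE DATABASE SalesDB
FROM DISK = N'X:\Backups\SalesDB_FULL.bak'
WITH MOVE N'SalesDB'     TO N'D:\Data\SalesDB.mdf',
     MOVE N'SalesDB_log' TO N'L:\Log\SalesDB_log.ldf',
     RECOVERY;

-- ...then run the full, expensive consistency check there, off production.
DBCC CHECKDB (SalesDB) WITH NO_INFOMSGS, ALL_ERRORMSGS;
```

As a bonus, a scheduled job doing exactly this doubles as a standing restore test: if the RESTORE step ever fails, you know about a bad backup long before you need it.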
If you choose to offload your consistency checks to another server, though, be sure to run DBCC CHECKDB WITH PHYSICAL_ONLY on the production server, and if you’re on SQL Server 2008 R2 SP1 CU4 or later, be sure to enable trace flags 2562 and/or 2549, which will speed up the PHYSICAL_ONLY checks further – you can read more about this enhancement here.
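The production-side check can then be as lightweight as the following sketch (database name is illustrative; the -1 flag makes the trace flags global, which 2549 in particular requires):

```sql
-- Enable the CHECKDB performance trace flags globally (SQL 2008 R2 SP1 CU4+):
-- 2549 treats each database file as if it were on a unique disk,
-- 2562 runs the checks as a single batch with a larger internal tempdb footprint.
DBCC TRACEON (2562, 2549, -1);

-- Physical-structure-only check: much cheaper than a full CHECKDB,
-- since the logical checks are being offloaded to the secondary server.
DBCC CHECKDB (SalesDB) WITH PHYSICAL_ONLY, NO_INFOMSGS;
```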
Back to the “how often” question for a second. If you have the disk space, the network latency, and the system resources to do so, why not back up the transaction log often? As in, every 5 minutes, or even more frequently? There’s not much downside to doing it, as you will have to clear the log with a backup sooner rather than later, lest you risk running out of space on your transaction log, or even your drive. The one drawback to this approach is that you will have more files to deal with at restore time, and processing each file adds a bit of extra time to the entire process. But it might be worth that time, knowing that you minimized the amount of data lost. Again, test your plan to make sure that it matches your particular needs.
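Here is what that restore-time cost looks like in practice – a sketch of restoring a chain of frequent log backups, with purely illustrative file names:

```sql
-- Restore the full backup first, leaving the database in a restoring state.
RESTORE DATABASE SalesDB
FROM DISK = N'X:\Backups\SalesDB_FULL.bak' WITH NORECOVERY;

-- Apply each log backup in sequence; every extra file is one more step here.
RESTORE LOG SalesDB FROM DISK = N'X:\Backups\SalesDB_log_0905.trn' WITH NORECOVERY;
RESTORE LOG SalesDB FROM DISK = N'X:\Backups\SalesDB_log_0910.trn' WITH NORECOVERY;

-- The last restore in the chain brings the database online.
RESTORE LOG SalesDB FROM DISK = N'X:\Backups\SalesDB_log_0915.trn' WITH RECOVERY;
```

Scripting the chain (from msdb history or your backup tool) rather than clicking through it by hand is what keeps this manageable at 5-minute intervals.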
This is another topic where everybody has a favorite choice. So, I’ll stick to mentioning what I like to do and what I consider to be the best practice in this regard. I like to back up to a SAN volume, i.e., a drive that actually lives on the SAN and can be easily attached to another server in a pinch, saving you valuable time – you wouldn’t need to restore files over the network (slow) or pull drives out of a dead server (been there, done that, and it’s also slow!).
The key is to have a copy of those backup files made quickly, and, if at all possible, to a remote target on a different datacenter – or even the cloud. There are plenty of solutions out there that can help you put such a solution together. That right there is the first step towards a practical Disaster Recovery plan. But there’s much more to DR, and that’s material for a different blog post in this series.