So speaking to any smart database people they will generally tell you that the SELECT * FROM ... is a bad idea and should not be used. Today (well would of been if I didn't procrastinate) I spent many hours debugging something from the wrong places because I used SELECT *. Below is what I found out =).
Rennie Moodley helped me understand what was happening after I figured out where the problem was, helped me create the sample code below and then the 2 of us and Christopher Swart poked at and improve the sample and found some more weirdness which you'll also see below.
For the setup of the sample you can use any database, the sample code contains everything to create a SampleDb and then drop it in the end.
So for the initial bit of weirdness that cost me a couple hours I basically had a table and 2 views (but the scenario can play out with a table and 1 view as we'll do in this post). It looked something like below
With this in place we are able to do a select and notice that we do have data
I then somewhere along the line added another column to the view (in this case table) like so
added a reference to this column in my code, got side tracked and then after lunch and a meeting come back to the code where I developed in some other areas of the app for something that I remembered about while away from my pc and then eventually came back to this area of code which by now I lost complete context of what I was doing. I ran the code and received a weird error saying that the column didn't exist. I double checked the view with sp_helptext to make sure it was in the database (only part that doesn't work with this sample because sp_helptext doesn't support tables). I thought at this stage maybe the view had the column specified (which was not like me because I'm a lazy developer), so I did a sp_helptext on the view.
which showed me that the view still had a star
but why then did the results show otherwise?
After as mentioned many hours of looking at this from checking that the app was till configured to hit this database and not a random other environment to restarting my PC. I reinstalled all the objects and by doing that lost all my sample data. Which we would simulate by dropping the table and view and then recreating the table with the extra column and then creating the view as we previously did
This now returned the data I was expecting
So what was different now? The code looked the same so what could it be. At the point the non lazy developer on a lot of energy drinks kicked in and I started digging which made me revert the table changes and reinstall everything again and then made the single column add as above which I confirmed made the weirdness return which for completeness looks like this
and then running that select
It was at this point that I realized that reinstalling the view fixed my problem probably. To try this you can either drop and create the view or simple ALTER it which is the most likely way you would do this in production for most peopleand then once again we run the select
and we now have that extra column.
At this point I called Rennie over and showed him the weirdness. After going for some coffee I found in my inbox a mail containing a sample script giving me some insight. The key pieces to note was that querying the information schema before running the ALTER
revealed that the column wasn't there
whereas running that ALTER statement again
So this again illustrated that what the ALTER achieved, he also mentioned that we can use sp_refreshview to refresh the view
At this point we did what is the best feeling in finding a bug like this in your code, we poked to see what else we (I) could learn from this which also revealed some almost more bazar then the views caching the columns it returned.
What happens if you rename one of the columns
we are given a warning saying that we could be breaking stuff
and so we ran the select and I was thinking that it would throw an error
and we noticed that the sp_rename didn't seem to work or at least that's what I would have thought before our earlier discovery. So just to double check we looked at the information schema again
and thought it would show us the same result as earlier with a row for A, B and C but instead noted that D was showing here
now I was thinking where could it be keeping the info for the views columns then and more importantly how do we get a A column back from the result when it doesn't exist. We looked next turned to sys.views and sys.columns
this showed us the A column still being there as part of what SQL knew to be in the view
so how does the view get the data which was in column A and is now in column D? Well for this we turn back to sp_refreshview, information schema, sys.views and sys.columns
which shows us
Now this is the part I looked at by myself so I can only assume that SQL is smart here and uses the column id/pos when it queries and labels the result as what it currently believes to be the column name
Dropping and Adding Column
So we knew how the view would handle column renames but what about dropping and creating columns. For this we dropped the column B and added the column E
run the view select againand as we could almost expect from this post column E didn't exist but what about column B? It should exist right be cause haven't refreshed the view cache but how could it because it was now dropped from the database. This most weirdly for me returns the column (which I guess could be expected) but now we have no data for column B
we run the magic again to see what SQL thinks is going onand notice now that information schema says we have no column B and also no column E and the sys.views and sys.columns query shows D, B and C rows
a sp_refreshview and select
shows the data as we would now expect it
Now again here I am assuming what SQL does and here I think it must obviously be using sys.columns for what columns to show and then going to information schema for where to find the data, naturally not find a matching column would return null data. However looking at the result above showing the queries off information schema, sys.views and sys.columns disproves my previous idea about how it matches the columns. Maybe it checks if the same amount of columns exist then use order matching otherwise the name matching or something silly like that but then that wouldn't work for a rename and drop and create without refreshing the view
Column Drop, Add and Rename
Before we see if anything weird happens here will quickly top up our table with data to see for NULL's, for that we'll just TRUNCATE and then INSERT rows as before
and we now have the data
and this is were things get very weird for me (could also be because it's late and my brain is tired) but if we drop, add and rename a column
and run our select
we expect that column C would contain NULL data as before but instead column E has the NULL data. Looking back to our new favorite place
we notice that column C has been removed as expected. If we run our refresh and select again
We see that now column E has data and the old column D has been renamed to G and has data. From this my new assumption is like my old one and that is that SQL get's the column pos it needs data for from information schema and then get's the label for those columns from that same position in sys.columns.
Dropping columns without adding any
So what happens when we drop a column but don't add another? Let's test that, we are going to drop the column E because it feels like it won't make any difference
we run the magic
now based on our last theory we were saying that it would get columns to get data from from Information Schema and labels from sys.columns so you would almost expect a similar result as Information Schema has 2 results and sys.columns has 3 to before with the last column being NULL. Let's run that select
And instead of getting a NULL value at the end we get an error
Msg 4502, Level 16, State 1, Line 396
View or function 'dbo.vw_Test' has more column names specified than columns defined.
Now I haven't figured how exactly why in the previous example SQL just puts a column with NULL data and in this one it throws an error. I know it's obvious that it's because there isn't a third column but it would be nice to find the place SQL looks to know the this which I will continue looking for =) and of course if we run the select now
we have a working view again
If I were not to use a SELECT *
A little clean up is needed here again. Lets drop the table and view and create them as they existed in the start of this post
now alter the view to use column names instead of *
And the select just to verify data
Add a column
Now let's do the add a column
The command completes as expected, run the select
We are given just our 2 columns as expected. Although this didn't magically give us the 3rd column if we do a sp_helptext on the view we will see that we are only asking for column A and B. Now if we ALTER the view to include column E
and run the select
We are given the extra column and nothing has surprised us yet
Dropping and Adding a column
So now that we have added a column and see that nothing weird happens let's try the dropping and adding a column example. We'll drop column B and add a column C
Now if you remember when we did this earlier we got funny data with column names that didn't match the data. Let's run the select
We get the error
Msg 207, Level 16, State 1, Procedure vw_Test, Line 472
Invalid column name 'B'.
Msg 4413, Level 16, State 1, Line 470
Could not use view or function 'dbo.vw_Test' because of binding errors.
We now get what you'd expect because that column doesn't exist. no chance of data getting "corrupt" by showing us data under the wrong heading and of course we can fix this by altering the view to drop the column B and add column C
and then running the select would show us data again
No surprises here either
To clear out everything we created we can just drop the database or if you used an existing database you can just drop the table and view
Think back to some of the weird SQL errors that you've had in the past, I can think of a couple of applications where I've come across this error and never dug to find out why it was working. Probably to busy chasing deadlines to learn.
This is also unfortunately one of the times where "Have you turned if off and on again" will not help as after restarting SQL you will still get the "wrong" data
From all of this you can see that using SELECT * is a bad idea, in most cases if you change a view you wouldn't go and refresh, alter or drop and create every other view in your database as that makes no sense at all.
Found this tweet a while ago about #SelectStar
A sample script with everything above including all the selects that weren't printed out but mentioned and all the "clean up" data can be found in my Git Hub Gists https://gordonbeeming.xyz/gist/aa8e5cf3dfee66f21bcb.