2

I'm working on an analyzer that accepts a SQL Server query and returns information about all columns in the first result set - until for clause is used, this is pretty straightforward with sp_describe_first_result_set. However, the queries I expect heavily use for json and sp_describe_first_result_set reports columns using this clause (correctly) as nvarchar. I've tried rolling my parser with SqlScriptDOM, but parsing got complicated fast. I've also toyed with restructuring the inner queries via said parser so sp_describe_first_result_set could be run against them (basically extracting them to be top-level). This approach also hit the complexity wall.

Is there any reasonable way to achieve this? When working in JetBrains' IDEs, I've checked that their built-in SQL query editor has this functionality, as hovering over a star in an inner query shows the exact columns to be selected in a tooltip.

2
  • Would appreciate someone letting me know whether I have to pull trough parsing the AST myself, or whether there is something out there which would save a lot of work. Thank you. Commented Jun 7 at 18:08
  • 4
    I'm a little confused: what exactly is not working? Do you have a query which sp_describe_first_result_set doesn't work with? Or is it your own parser that's the problem? How is SqlScriptDOM relevant? Please show an example query, along with what you are trying to achieve, and what you are currently doing that has the issue. Commented Jun 9 at 0:26

2 Answers 2

2

A SqlScriptDOM visitor can be used to easily find FOR JSON clauses within a T-SQL script. This will facilitate creating a new script without FOR JSON for use by sp_describe_first_result_set to retrieve result set meta data that would otherwise be obscured by the json result.

Below is a .NET console application example that uses a visitor to find FOR JSON clauses in a T-SQL script, creates a new script without those clauses, and finally invokes sp_describe_first_result_set to load result set meta data into a data table.

See this article for more information on the T-SQL Script DOM and Visitor.

using Microsoft.Data.SqlClient;
using Microsoft.SqlServer.TransactSql.ScriptDom;
using System.Data;
using System.Diagnostics;

string tSqlScriptWithForJson = "SELECT * FROM sys.tables FOR JSON PATH;";
string connectionString = "Data Source=.;Integrated Security=SSPI;Initial Catalog=tempdb;TrustServerCertificate=true";

var example = new VisitorExample();
//use visitor to get script without FOR JSON clauses
string tSqlScriptWithoutForJson = example.GetScriptWithoutForJson(tSqlScriptWithForJson);
Debug.WriteLine($"Original script: {tSqlScriptWithForJson}");
Debug.WriteLine($"Original script: {tSqlScriptWithoutForJson}");

//execute sp_describe_first_result_set without JSON FOR cleause to load result set meta data into data table
using (var connection = new SqlConnection(connectionString))
using (var command = new SqlCommand("sp_describe_first_result_set", connection) { CommandType = System.Data.CommandType.StoredProcedure })
{
    var metaData = new DataTable();
    command.Parameters.Add("tsql", System.Data.SqlDbType.NVarChar, -1).Value = tSqlScriptWithoutForJson;
    connection.Open();
    metaData.Load(command.ExecuteReader());

}

//A visitor class derives from TSqlConcreteFragmentVisitor or TSqlFragmentVisitor.
//The overloaded Visit method is invoked whenever a fragment of the specified type is visited after the Accept method call.
public class VisitorExample : TSqlConcreteFragmentVisitor
{
    // visited JSON FOR clause fragments
    public List<JsonForClause> jsonForClauses = new List<JsonForClause>();

    // get T-SQL script with JSON FOR clauses removed
    public string GetScriptWithoutForJson(
        string tSqlScript
    )
    {

        var parser = new TSql170Parser(true, SqlEngineType.All);
        IList<ParseError> parseErrors = new List<ParseError>();

        //create TextReader with script for parsing
        TextReader stringReader = new StringReader(tSqlScript);
        //parse script to get fragments and tokens
        var tSqlScriptFragment = (TSqlScript)parser.Parse(stringReader, out parseErrors);
        if (parseErrors.Count > 0)
        {
            //throw custom exception with all parsing errors
            throw new TSqlParseException(tSqlScriptFragment, parseErrors);
        }

        //invoke visitor to find JSON FOR clause fragments
        tSqlScriptFragment.Accept(this);

        //return script without JSON FOR clauses
        return getScriptWithoutJsonForClauses(tSqlScriptFragment);

    }

    public override void Visit(JsonForClause sqlFragment)
    {

        //add FOR JSON fragments to list - these will be removed in the final script
        jsonForClauses.Add(sqlFragment);

    }

    private string getScriptWithoutJsonForClauses(TSqlFragment tSqlScriptFragment)
    {

        var scriptWithoutJsonForClauses = new StringWriter();

        for (int tokenIndex = tSqlScriptFragment.FirstTokenIndex; tokenIndex <= tSqlScriptFragment.LastTokenIndex; ++tokenIndex)
        {

            if (!isTokenInForJsonClause(tokenIndex))
            {
                //keep original script text if not in a JSON FOR clause, otherwise ignore
                scriptWithoutJsonForClauses.Write(tSqlScriptFragment.ScriptTokenStream[tokenIndex].Text);
            }

        }

        return scriptWithoutJsonForClauses.ToString();

    }

    //return true if token is in a FOR JSON clause
    private bool isTokenInForJsonClause(int tokenIndex)
    {

        foreach (var jsonForClauseFragment in jsonForClauses)
        {
            if (tokenIndex >= jsonForClauseFragment.FirstTokenIndex && tokenIndex <= jsonForClauseFragment.LastTokenIndex)
            {
                return true;
            }
        }
        return false;

    }

}

//custom class for parser exceptions
public class TSqlParseException : Exception
{

    private TSqlScript tSqlScriptFragment;
    private IList<ParseError> parseErrors;

    public TSqlScript TSqlScriptFragment { get => tSqlScriptFragment; }
    public IList<ParseError> ParseErrors { get => parseErrors; }

    public TSqlParseException(TSqlScript tSqlScriptFragment, IList<ParseError> parseErrors) : base($"{parseErrors.Count} error{(parseErrors.Count == 1 ? "" : "s")} parsing T-SQL script.")
    {
        this.tSqlScriptFragment = tSqlScriptFragment;
        this.parseErrors = parseErrors;
    }

}
2
  • Thanks for replying. That's basically what I did initially; however, this approach won't work for correlated queries, etc. Consider a case in which the inner query depends on a table from the outer scope. In this case, we can't extract the inner query (at least in this naive manner). Commented Jun 11 at 12:55
  • 1
    @MatějŠtágl, not sure I understand what you mean about extracting the inner query. I understood from your question you wanted to get meta data (column names and data types) of the result exposed by the (outermost) query so correlated queries are not relevant. Add examples of problem queries to your question and expected results. Commented Jun 11 at 13:31
1

Assuming I'm reading your question correctly - it is not possible for you to satisfy this scenario in a sensible way.

JSON is - as you've noticed - just text. How you translate back from text to strict data types is an arbitrary choice. Consider:

id json
1 {"a":1,"b":["x","y"]}
2 {"b":2,"c":{"z":1}}

You cannot be sure any pair of JSON payloads in a common column will share a common schema. Even if you were to...

  1. scan all text columns for json eligibility
  2. take a sampling of the values in those columns
  3. write some logic to infer what the common schema might be among them

...you'd still be limited by your own conflict resolution choices in point 3. And this is before even considering the added computational cost of doing all that tomfoolery past the first level of parsing.

...in JetBrains' IDEs ... their built-in SQL query editor has this functionality.

Assuming you're talking about a feature similar to the below screencap...

SSMS, RedGate SQL Compare plugin - hovering cursor over * opens tooltip showing column names, types, and nullability.

...this feature is backed by checking the rich column types that still exist before any transformation to JSON.


You can get "clever" and try to infer JSON & typing with something like openjson, but as noted, this will be expensive and unreliable in even the most trivial edge cases.

3
  • 2
    Side note: SQL Server 2025 adds a native JSON type (so not just text) Commented Jun 10 at 0:12
  • TIL. There's still no inferable schema from a document's content though; so native type support doesn't allow you to reverse engineer attribute metadata without something external like jsonschema. Commented Jun 10 at 15:04
  • Thanks for the effort put into this answer; I appreciate the affirmation that there is no easy way around it. I'm now immersed in the way @Dan Guzman started, getting it work with the concerns I raised there has been a lot of pain already, but that's my problem. I've upvoted your answer, but won't be accepting it, as it's not a solution (and given this is basically roll your own parser, I don't expect anyone to spoon-feed me the entirety of it). Commented Jun 11 at 12:59

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.