Skip to content

For extra metadata lookup#263

Closed
v-nisidh wants to merge 2 commits intomicrosoft:devfrom
v-nisidh:For_#244_Extra_Metadata_lookup
Closed

For extra metadata lookup#263
v-nisidh wants to merge 2 commits intomicrosoft:devfrom
v-nisidh:For_#244_Extra_Metadata_lookup

Conversation

@v-nisidh
Copy link
Copy Markdown
Contributor

@v-nisidh v-nisidh commented Apr 25, 2017

Fixes #244.

This is breaking change so I used System properties / Environment properties to enable this change.

By using following property one can save one roundtrip to server for fetching metadata for parameternames in stored procedure.
com.mssql.metadata.lookup=N

Although it works with my unit test cases following scenarios will break existing behavior.

  1. Provide parameter names not in order for your stored procedure.
  2. Use index as well as parameter names for stored procedure.

By default it will assume com.mssql.metadata.lookup=Y. In order to change this behavior you can set System / environment property. System property will get precedence.

You can test it by adding following VM arguments too.
-Dcom.mssql.metadata.lookup=N

TODO:

  • Add Unit Test case
  • Instead of system properties use connection parameters
  • Call Stored Procedure directly
  • Create WIKI page for driver behavior

Adding code to avoid extra metadata lookup for Store Procedure with name
param
Remove Utils.java as this might confuse with our Util.java.. Added
Parameterized Unit Test & updated dependency in pom file.
@codecov-io
Copy link
Copy Markdown

codecov-io commented Apr 25, 2017

Codecov Report

Merging #263 into dev will increase coverage by 0.02%.
The diff coverage is 19.4%.

Impacted file tree graph

@@             Coverage Diff              @@
##                dev     #263      +/-   ##
============================================
+ Coverage     34.34%   34.36%   +0.02%     
- Complexity     1553     1561       +8     
============================================
  Files           101      101              
  Lines         23656    23697      +41     
  Branches       3878     3889      +11     
============================================
+ Hits           8124     8144      +20     
- Misses        13936    13957      +21     
  Partials       1596     1596
Flag Coverage Δ Complexity Δ
#JDBC41 34.22% <18.65%> (+0.02%) 1552 <12> (+9) ⬆️
#JDBC42 34.28% <19.4%> (+0.04%) 1559 <12> (+13) ⬆️
Impacted Files Coverage Δ Complexity Δ
...oft/sqlserver/jdbc/SQLServerCallableStatement.java 15.35% <14.4%> (+1.53%) 49 <8> (+10) ⬆️
...c/main/java/com/microsoft/sqlserver/jdbc/Util.java 42.22% <88.88%> (+1.09%) 57 <4> (+4) ⬆️
...m/microsoft/sqlserver/jdbc/SQLServerResultSet.java 25.83% <0%> (-0.71%) 181% <0%> (-6%)
...om/microsoft/sqlserver/jdbc/ReaderInputStream.java 44.94% <0%> (ø) 16% <0%> (ø) ⬇️
...om/microsoft/sqlserver/jdbc/SQLServerBulkCopy.java 46.24% <0%> (ø) 193% <0%> (-1%) ⬇️
...in/java/com/microsoft/sqlserver/jdbc/IOBuffer.java 38.87% <0%> (+0.1%) 0% <0%> (ø) ⬇️
...rc/main/java/com/microsoft/sqlserver/jdbc/DDC.java 28.57% <0%> (+0.22%) 51% <0%> (-1%) ⬇️
src/main/java/microsoft/sql/DateTimeOffset.java 40% <0%> (+2.85%) 10% <0%> (+2%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 18a9265...354d845. Read the comment docs.

@Suraiya-Hameed
Copy link
Copy Markdown
Contributor

sp_sproc_columns gets called on accessing stored procedures with index.

@Suraiya-Hameed
Copy link
Copy Markdown
Contributor

If call to the storedProcedure has parameter name in it, I would expect it to work either if the parameters are set in-order or out-of-order.

This below snippet doesn't work with these changes,

cstmt = conn.prepareCall("exec dbo.sp @p1=?, @p2=?");
cstmt.registerOutParameter("p2", java.sql.Types.INTEGER);
cstmt.setString("p1", "xyz");

@v-nisidh
Copy link
Copy Markdown
Contributor Author

v-nisidh commented Apr 26, 2017

@v-suhame :

As we are not calling DatabaseMetaData.getProcedureColumns() we can not predict order of parameters. So if we want to omit metadata call to fetch procedure columns then we have to rely on end user to provide parameters in correct order. And due to this it's a breaking change.

If call to the storedProcedure has parameter name in it, I would expect it to work either if the parameters are set in-order or out-of-order.
This below snippet doesn't work with these changes,

cstmt = conn.prepareCall("exec dbo.sp @p1=?, @p2=?");
cstmt.registerOutParameter("p2", java.sql.Types.INTEGER);
cstmt.setString("p1", "xyz");

@v-nisidh
Copy link
Copy Markdown
Contributor Author

v-nisidh commented Apr 27, 2017

@v-suhame
If we used index & param names both for same CallableStatement then either we will get an error or it will call sp_proc_columns to fetch procedure columns.

  1. if we used registerOutputParameter with an index & set parameter by parameter name or vice-a-versa then we will get an exception.
  2. If we used index for setting up parameters but while retrieving we used parameter name then driver is not aware of parameter names as we used index to set parameters then it will fetch metadata column names to fetch data by sp_proc columns

Working Example:

          Properties props = System.getProperties();
          props.setProperty("com.mssql.metadata.lookup", "N");

          cstmt=conn.prepareCall("{call dbo.GetAddressCount(?,?)}");
          // cstmt.setString(1, "Bothell");
          // cstmt.registerOutParameter(2, java.sql.Types.INTEGER);
         
          cstmt.setString("City", "Bothell");
          cstmt.registerOutParameter("AddressCount", java.sql.Types.INTEGER);
          cstmt.execute();

          //   System.out.println("Output from procedure:\t" + cstmt.getInt(1));
          System.out.println("Output from procedure:\t" + cstmt.getInt("AddressCount"));

          cstmt.close();

sp_sproc_columns gets called on accessing stored procedures with index.

@brettwooldridge
Copy link
Copy Markdown
Contributor

@v-nisidh @v-suhame @marschall, @sehrope @gordthompson

I hate to throw cold-water on any contribution, and in particular I am a performance hawk, so this kind of change generally gets my support, but in this case I question the value.

The cost of an additional roundtrip is negligible for a single execution. If the goal is to eliminate the overhead for multiple executions, I would suggest that addressing this as part of a larger metadata caching scheme (see 166) is a better way to go.

Having said that, if the other reviews feel this is still a desirable change...

Everybody, together, repeat this 3 times:

System properties that affect driver behavior are inherently evil.

System properties stand in the way of deployment within containers (J2EE, OSGi, ...) due to their global (JVM-wide) nature. For example, web archives (WARs) typically define database driver configuration on a per-WAR basis.

Additionally, from a maintenance perspective having all of the configuration in one place, eg. a properties file, XML file, etc., is pretty important.

Any setting that affects driver behavior should be exposed as both a JDBC URL parameter and as a getter/setter on the DataSource class.

@marschall
Copy link
Copy Markdown
Contributor

@brettwooldridge I agree with regards to system properties / environment variables. I would prefer for the existing connection properties mechanism to be used.

@sehrope
Copy link
Copy Markdown
Contributor

sehrope commented Apr 27, 2017

Having something like getSystemEnvOrProperty(...) buried inside internal methods isn't a good idea. If you do read System properties or environment variables, it should be done once at connection creation ... and +infinity to @brettwodridge's comment you really shouldn't be using global properties at all anyway. Everything should be done, and configurable, at the connection level. That's what connection properties are for.

Regarding the breaking change itself, it'd be very confusing behavior to have SQL that works in some clients but not others. In my book that's a non-starter as rule number one for any database driver is to have a consistent interface to execute commands that are accepted by that database.

Having connection properties like this also leads to cargo cults of "Always set com.mssql.fast=true!", which will lead to people enabling it in places they don't understand, which will lead to out of order usages (like the example up thread) breaking.

So I don't think this warrants a breaking change. Plus, again per @brettwooldridge, it's optimizing a single execution where the real value would be caching across repeated batches. If there's a way to do it in the bulk case without breaking existing code it'd have value, but as it stands now, it's not worth it.

@v-nisidh
Copy link
Copy Markdown
Contributor Author

v-nisidh commented Apr 27, 2017

@sehrope, @brettwooldridge I completely agree with your comments related to getSystemProperties() although it is a normal practice in webserver world. As this is a breaking change we want to publish behavior and get suggestions from all contributors.

@brettwooldridge : For long term solution, metadata caching and statement caching are candidates and we are working on them.

FYI: @marschall, @gordthompson

@TobiasSQL
Copy link
Copy Markdown
Contributor

@v-nisidh , @brettwooldridge et al, I wonder why the driver needs this functionality at all. I am probably missing something but why can't the driver just pass parameters by name/index to the server simply based on what the developer uses. In this case we don't need to make the extra call at all.

@sehrope
Copy link
Copy Markdown
Contributor

sehrope commented Apr 27, 2017

I had the same thought as well (i.e. Why is this necessary?). I'm not intimately familiar with this piece of the driver but I think the main requirement for the procedure metadata is to convert from the generic JDBC procedure syntax to SQL Server specific syntax while allowing the use of setXYZ(String name, XYZ value). The setXYZ(...) methods use findColumn(...), which does the metadata lookup, to map from named parameters to positional ones.

There's an example of this usage on the SQL Server website: https://docs.microsoft.com/en-us/sql/connect/jdbc/using-a-stored-procedure-with-output-parameters

I've personally never used this feature (I always stick to native SQL for executing stored procs...) but removing it would break all usage of the CallableStatement.setXYZ(String name, XYZ value).

@TobiasSQL
Copy link
Copy Markdown
Contributor

@sehrope , exactly :-) And SQL Server supports both pass by index and name. So not sure why we would need to figure this out, just convert syntax from ODBC style to SQL Server style and pass by name/index depending on what caller used.

@Suraiya-Hameed
Copy link
Copy Markdown
Contributor

Existing driver behaviour: call to the stored procedure is encapsulated by sp_executesql, ie..,

conn.prepareCall("{call spName (..)}");

this API call is internally executed as,

exec sp_executesql N'EXEC spName ...'

This was the reason to have an extra call to sp_sproc_columns to get metadata, and map parameters by name/index.

I would suggest calling the stored procedure directly (without encapsulating it in sp_executesql), this way it would be possible to pass parameters by name and index.

@TobiasSQL
Copy link
Copy Markdown
Contributor

@v-suhame, exactly :-)

@gordthompson
Copy link
Copy Markdown
Contributor

gordthompson commented Apr 28, 2017

@v-suhame

I would suggest calling the stored procedure directly

This gets my vote, too. It also seems to be the way that both jTDS and .Net SqlClient operate. For example, when I do this in C# ...

using (var cmd = new SqlCommand())
{
    cmd.Connection = conn;
    cmd.CommandType = System.Data.CommandType.StoredProcedure;
    cmd.CommandText = "dbo.MenuPlanner";
    cmd.Parameters.AddWithValue("food", "bacon");
    cmd.Parameters.AddWithValue("name", "Gord");
    Console.WriteLine(cmd.ExecuteScalar());
}

... I see this in SQL Profiler:

RPC:Completed   exec dbo.MenuPlanner @food=N'bacon',@name=N'Gord'

@sehrope
Copy link
Copy Markdown
Contributor

sehrope commented Apr 29, 2017

Skimming through this again ... yes this whole thing makes no sense if the database server supports both named and numbered parameters. Just execute it directly based on either the positional or named parameters.

The oddball case would be supporting combinations of both positional and named parameters (third example below):

-- Stored proc definition and invocation styles

CREATE PROCEDURE some_proc(@a int, @b int)
AS
BEGIN
  SELECT
    @a AS a,
    @b AS b
END

-- 1) Positional only:
EXEC some_proc 1, 2

-- 2) Named only:
EXEC some_proc @a=1, @b=2

-- 3) Positional followed by named:
EXEC some_proc 1, @b=2
// JDBC usage

// 1) Positional only:
stmt.setInt(1, 1);
stmt.setInt(2, 2);

// 2) Named only:
stmt.setInt("a", 1);
stmt.setInt("b", 2);

// 3) Positional followed by named:
stmt.setInt(1, 1);
stmt.setInt("b", 2);

That should be possible as well by allowing both index and String based setXYZ(...) methods on the CallableStatement and applying named parameters after any indexed ones. They could even be done out of order so long as at execution time the driver would check that if any indexed parameters are specified, they should cover a closed set from one through the number of indexed parameters (ex: three indexed parameters should cover [1,2,3]).

@brettwooldridge
Copy link
Copy Markdown
Contributor

@sehrope In the case with named and indexed parameters, if there is a restriction that the index parameters must cover a closed set, I don't think that is going to fly. Unless I misunderstood, that would preclude:

stmt.setInt(1, 1);
stmt.setInt("b", 2);
stmt.setInt(3, 3);

Is that correct? If so, such a restriction would violate the JDBC specification.

@sehrope
Copy link
Copy Markdown
Contributor

sehrope commented Apr 29, 2017

@brettwooldridge Yes that's correct. If the JDBC spec allows for arbitrarily mixing named and numbered parameters in any order, I'm not sure how that would be possible with SQL Server. The stored procedure execution syntax doesn't allow for positional parameters after named ones. They have to be at the start of the parameter list:

-- Valid:
EXEC pr_some_proc 1, 2, @c = 3

-- Invalid:
EXEC pr_some_proc 1, @b = 2, 3

Another alternative would be to blindly save the parameter binds and process them en masse when the statement is executed. At that point if the positional parameters don't cover the [1..N] set and there are named parameters as well, then it would fall back to a metadata lookup to replace the higher numbered positional parameters with named ones.

In the more common case where only one types is used or the positional parameters cover the [1..N] set, no metadata lookup would be necessary.

While I'm all for JDBC spec compliant and backwards compatible, I'm not sure if the added complexity is worth it to cover your out of order example though. Mixing parameter bind types out of order seems "weird" and I can't think of a legit use case for it.

@brettwooldridge
Copy link
Copy Markdown
Contributor

@sehrope I do think that 99.9% of the time either all parameters are going to be either positional or name-based.

It is not a question of whether there are "legit" uses for it, it is a question of whether such cases already exist in the wild.

I can easily envision a scenario with one developer who prefers "by name" for readability, and another who prefers "by index", working in the same code-base and ending up with such a mixed case in a piece of code. I am nearly certain that if I could "grep" Github alone I would find examples of it, not even counting the several billion lines of private MS SQL Java code out there.

Basically, you don't want to be the guy who made the decision that breaks a major banking or telecom application when they upgrade to the v6 driver, do you?

If such a decision was taken at the outset and documented, nobody should be suprised. But we are talking about a conscious decision to violate JDBC compliance, as well as a conscious decision to introduce a regression.

The two choices, as I see it are:

  • Queue up the parameter binds, and at execution time only optimize for the "all index" or "all name" case, otherwise get the metadata (the code currently exists). Or...
  • Do nothing and await the metadata caching solution that various ppl including myself are looking at.

@sehrope
Copy link
Copy Markdown
Contributor

sehrope commented Apr 29, 2017

It is not a question of whether there are "legit" uses for it, it is a question of whether such cases already exist in the wild.

I can easily envision a scenario with one developer who prefers "by name" for readability, and another who prefers "by index", working in the same code-base and ending up with such a mixed case in a piece of code. I am nearly certain that if I could "grep" Github alone I would find examples of it, not even counting the several billion lines of private MS SQL Java code out there.

I really doubt that. Again, it's not just mixing the two together, it's mixing them together out of order (e..g setting the second parameter by index but the first by name). That's why it's so weird.

It also gives you more odd edge cases like what if I set a parameter by index, but then specify the same parameter by name as well. Does last write win?

Basically, you don't want to be the guy who made the decision that breaks a major banking or telecom application when they upgrade to the v6 driver, do you?

The person that upgrades their project's JDBC driver without reading the change log or performing their own testing that would be responsible for that. Plus while both types are currently supported, I don't see that officially documented anywhere (see next section...).

If such a decision was taken at the outset and documented, nobody should be suprised. But we are talking about a conscious decision to violate JDBC compliance, as well as a conscious decision to introduce a regression.

I took a look at the JDBC 4.2 spec (http://download.oracle.com/otn-pub/jcp/jdbc-4_2-mrel2-spec/jdbc4.2-fr-spec.pdf) and found this:

13.3 CallableStatement
...
13.3.2 Setting Parameters
...
It is not possible to combine setting parameters with ordinals and with names in the same statement. If ordinals and names are used for parameters in the same statement, an SQLException is thrown.

... so technically the driver is being non-compliant by allowing the user to use both types of parameter binding.

It'd still be a regression to change it as it's modifying existing behavior. But it'd make the driver more compliant with the JDBC spec.

@brettwooldridge
Copy link
Copy Markdown
Contributor

@sehrope You're right. That's what I get for writing in the middle of the night. I was thinking about ResultSet ordinal vs. named parameter access.

Move along, nothing to see here...

@msftclas
Copy link
Copy Markdown

@v-nisidh, thanks for signing the contribution license agreement. We will now validate the agreement and then the pull request.

Thanks, Microsoft Pull Request Bot

@ajlam
Copy link
Copy Markdown
Member

ajlam commented Jul 17, 2017

Will need to investigate a different implementation. Closing this PR for now.

@ajlam ajlam closed this Jul 17, 2017
@AfsanehR-zz
Copy link
Copy Markdown
Contributor

Hi @sehrope @brettwooldridge @v-nisidh @gordthompson @marschall would appreciate your feedback on the new pr #547 which addresses the issue related to the extra metadata lookup for named parameters. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.