Skip to content

[QUESTION] Exception is not thrown as expected when the session is killed in SQL database #1846

@DorisTao77

Description

@DorisTao77

Question

when using this library in Spark to run the SQL query and get the data into a Spark dataframe, if the session is accidentally killed from SQL database side, there's no reset error thrown as expected, but the count result has indicated that the data is not read completely.

steps to reproduce:

(1) use this library spark to run the SQL query and get the data into a Spark dataframe, code shown as below:

%scala
val token = "<token>"

val jdbcHostname = "xinrandatabseserver.database.windows.net"
val jdbcDatabase = "xinranSQLDatabase"
val jdbcPort = 1433
val jdbcUrl = "jdbc:sqlserver://%s:%s;databaseName=%s;encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net".format(jdbcHostname, jdbcPort, jdbcDatabase)+ ";accessToken="


import java.util.Properties
val connectionProperties = new Properties()
val driverClass = "com.microsoft.sqlserver.jdbc.SQLServerDriver"
connectionProperties.setProperty("Driver", driverClass)
connectionProperties.setProperty("accesstoken", token)

val sql_pushdown = "(select UNITS from payment_balance_new) emp_alias"
val df_stripe_dispute = spark.read.option("connectRetryCount", 200).option("numPartitions",1).jdbc(url=jdbcUrl, table=sql_pushdown, properties=connectionProperties)
df_stripe_dispute.count()

(2) kill the session from the SQL database side by "kill [SPID]"

(3) check the count() result, which is less than the actual one, but no error exception is thrown, the command is executed successfully.

Need help on

In general Spark connects to SqlServer and reads the data using the below code where rs is com.microsoft.sqlserver.jdbc.SQLServerResultSet, so whenever you killed the session rs.next() returned false but didn't throw the exception.

override protected def getNext(): InternalRow = {
if (rs.next()) {
inputMetrics.incRecordsRead(1)
var i = 0
while (i < getters.length) {
getters(i).apply(rs, mutableRow, i)
if (rs.wasNull) mutableRow.setNullAt(i)
i = i + 1
}
mutableRow
} else {
finished = true
null.asInstanceOf[InternalRow]
}
}

Based on the code analysis of com.microsoft.sqlserver.jdbc.SQLServerResultSet.next() there is a possibility that it returns false when the connection is broken, You can refer the code here https://github.com/microsoft/mssql-jdbc/blob/main/src/main/java/com/microsoft/sqlserver/jdbc/SQLServerResultSet.java#L1011-L1107

Could you help check why the exception is not thrown as expected? Thanks!

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions