An identifier is a string used to identify a database object such as a table, view, schema, column, etc. Spark SQL has regular identifiers and delimited identifiers, which are enclosed within backticks. Both regular identifiers and delimited identifiers are case-insensitive.
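A short sketch of the two identifier kinds, assuming a hypothetical table `tb1(id INT)` already exists in the default database; the backtick-delimited name `my-table` is an invented example of an identifier that a regular identifier could not express:

```scala
// Regular identifier: letters, digits, underscores; case-insensitive,
// so ID resolves to the column id.
spark.sql("SELECT ID FROM tb1")

// Delimited identifier: enclosed in backticks, allowing characters such
// as a hyphen that a regular identifier cannot contain. Still case-insensitive.
spark.sql("CREATE TABLE `my-table` (id INT) USING parquet")
spark.sql("SELECT id FROM `MY-TABLE`")
```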
// Changing the SELECT to qualify the column as Tb1.id raises an error. At the
// bottom of the logical plan you can see that Spark lowercased the table name
// in the catalog; in other words the qualifier Tb1.id != tb1.id, so the column
// cannot be found.
scala> spark.sql("select Tb1.id from Tb1").explain(true)
org.apache.spark.sql.AnalysisException: Column 'Tb1.id' does not exist. Did you mean one of the following? [spark_catalog.default.tb1.id]; line 1 pos 7;
'Project ['Tb1.id]
+- SubqueryAlias spark_catalog.default.tb1
   +- Relation default.tb1[id#32] parquet
// In Spark 3.2.2 the error message reads as follows
scala> spark.sql("select Tb1.id from Tb1").show(false)
org.apache.spark.sql.AnalysisException: cannot resolve 'Tb1.id' given input columns: [spark_catalog.default.tb1.id]; line 1 pos 7;
'Project ['Tb1.id]
+- SubqueryAlias spark_catalog.default.tb1
   +- Relation default.tb1[id#24] parquet
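A minimal sketch of ways to avoid the error above, assuming the same `tb1` table; the idea is to make the qualifier match the lowercased name that the catalog stores (or to sidestep the catalog name entirely with an alias):

```scala
// Qualifier written to match the catalog's lowercased table name: resolves fine.
spark.sql("select tb1.id from Tb1").show(false)

// Table alias: the alias you choose becomes the qualifier, so the stored
// case of the table name no longer matters.
spark.sql("select t.id from Tb1 t").show(false)
```

Either form keeps the query independent of how the table name was capitalized when it was created.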