Tag: pyspark sql

inferSchema with spark.read.format("com.crealytics.spark.excel") infers double for a date-type column

I am working in PySpark (Python 3.6 and Spark 2.1.1) and trying to read data from an Excel file using spark.read.format("com.crealytics.spark.excel"), but a date-type column is being inferred as double.

Example input:

    df = spark.read.format("com.crealytics.spark.excel").\
        option("location", "D:\\Users\\ABC\\Desktop\\TmpData\\Input.xlsm").\
        option("spark.read.simpleMode","true"). \
        option("treatEmptyValuesAsNulls", "true").\
        option("addColorColumns", "false").\
        option("useHeader", "true").\
        option("inferSchema", "true").\
        load("com.databricks.spark.csv")

Result:

    Name | Age | Gender | DateOfApplication
    ________________________________________
    X    | 12  | F      | 5/20/2015
    Y    | 15  | F      | 5/28/2015
    Z    | 14  | F      | 5/29/2015

Print schema […]
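For anyone hitting the same thing: one possible explanation is that the double values are Excel serial date numbers (days counted from 1899-12-30 in Excel's 1900 date system). That is an assumption, since the post does not show the inferred values. Under that assumption, a minimal sketch of the conversion in plain Python could look like this (`excel_serial_to_date` is a hypothetical helper name, not part of spark-excel):

```python
from datetime import date, timedelta

# Assumption: the double column holds Excel serial day numbers from the
# 1900 date system. For dates after 1900-02-28 the effective epoch is 1899-12-30.
EXCEL_EPOCH = date(1899, 12, 30)


def excel_serial_to_date(serial):
    """Convert an Excel serial day number (int or float) to a datetime.date.

    Hypothetical helper; any fractional part (time of day) is dropped.
    """
    if serial is None:
        return None
    return EXCEL_EPOCH + timedelta(days=int(serial))


# 42144 is the serial number for 5/20/2015, the first date in the sample table.
print(excel_serial_to_date(42144.0))  # 2015-05-20
```

In PySpark, a function like this could be wrapped with pyspark.sql.functions.udf and a DateType return type and applied to the DateOfApplication column. Whether that is needed depends on what the inferred column actually contains.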