Merge branch 'main' into feat/limit_put_memory
dl239 authored Dec 1, 2023
2 parents b566b7c + d1d2e38 commit f340b2b
Showing 26 changed files with 351 additions and 245 deletions.
7 changes: 6 additions & 1 deletion CMakeLists.txt
@@ -30,9 +30,14 @@ if (CMAKE_BUILD_TYPE STREQUAL "")
endif ()

if (NOT DEFINED CMAKE_PREFIX_PATH)
set(CMAKE_PREFIX_PATH "${CMAKE_SOURCE_DIR}/.deps/usr")
if (DEFINED ENV{THIRD_PARTY_DIR})
set(CMAKE_PREFIX_PATH $ENV{THIRD_PARTY_DIR})
else()
set(CMAKE_PREFIX_PATH "${CMAKE_SOURCE_DIR}/.deps/usr")
endif()
endif()

message (STATUS "CMAKE_PREFIX_PATH: ${CMAKE_PREFIX_PATH}")
message (STATUS "CMAKE_BUILD_TYPE: ${CMAKE_BUILD_TYPE}")
set(OPENMLDB_VERSION_MAJOR 0)
set(OPENMLDB_VERSION_MINOR 8)
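
The selection order in the hunk above can be sketched as follows. This is a minimal Python illustration and not part of the commit; `resolve_prefix_path` and its parameters are hypothetical names, and the sketch assumes the same behavior as CMake's `DEFINED ENV{...}` check, which is true even when the variable is set to an empty string.

```python
import os
from typing import Mapping, Optional


def resolve_prefix_path(
    source_dir: str,
    predefined: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Mirror the CMAKE_PREFIX_PATH selection order (hypothetical helper)."""
    env = os.environ if env is None else env
    # 1. A CMAKE_PREFIX_PATH that is already defined is left untouched.
    if predefined is not None:
        return predefined
    # 2. Otherwise THIRD_PARTY_DIR from the environment wins, if it is set.
    if "THIRD_PARTY_DIR" in env:
        return env["THIRD_PARTY_DIR"]
    # 3. Fall back to the bundled dependency directory.
    return f"{source_dir}/.deps/usr"


if __name__ == "__main__":
    print(resolve_prefix_path("/src", env={"THIRD_PARTY_DIR": "/opt/thirdparty"}))
```

In practice this means exporting `THIRD_PARTY_DIR` before configuring is enough to point the build at a different dependency directory, unless `CMAKE_PREFIX_PATH` was passed explicitly.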
1 change: 1 addition & 0 deletions docs/en/reference/sql/ddl/SET_STATEMENT.md
@@ -34,6 +34,7 @@ The following format is also equivalent.
| @@session.enable_trace|@@enable_trace | When the value is `true`, an error message stack will be printed when the SQL statement has a syntax error or an error occurs during the plan generation process. <br />When the value is `false`, only the basic error message will be printed if there is a SQL syntax error or an error occurs during the plan generation process. | `true`, <br /> `false` | `false` |
| @@session.sync_job|@@sync_job | When the value is `true`, the offline command will be executed synchronously, waiting for the final result of the execution.<br />When the value is `false`, the offline command returns immediately. If you need to check the execution, please use `SHOW JOB` command. | `true`, <br /> `false` | `false` |
| @@session.sync_timeout|@@sync_timeout | When `sync_job=true`, sets how long to wait for synchronous commands. When the timeout is reached, the command returns immediately; you can still check the command's execution through `SHOW JOB`. | Int | 20000 |
| @@session.spark_config|@@spark_config | Sets the Spark configuration for offline jobs, for example 'spark.executor.memory=2g;spark.executor.cores=2'. Note that this Spark configuration has higher priority than the TaskManager Spark configuration but lower priority than the CLI Spark configuration file. | String | "" |
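
The `key=value;key=value` format and the priority order described for `spark_config` can be sketched as below. This is a hedged Python illustration, not OpenMLDB code: `parse_spark_config` and `effective_spark_config` are hypothetical helpers, and the sketch assumes that precedence is resolved per key, which the source does not state explicitly.

```python
from typing import Dict, Mapping


def parse_spark_config(text: str) -> Dict[str, str]:
    """Parse 'k1=v1;k2=v2' into a dict (hypothetical helper)."""
    conf: Dict[str, str] = {}
    for item in text.split(";"):
        item = item.strip()
        if not item:
            continue  # tolerate empty segments such as a trailing ';'
        key, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"expected key=value, got {item!r}")
        conf[key.strip()] = value.strip()
    return conf


def effective_spark_config(
    taskmanager: Mapping[str, str],
    session: Mapping[str, str],
    cli_file: Mapping[str, str],
) -> Dict[str, str]:
    """Merge layers from lowest to highest priority (assumed per-key override)."""
    merged: Dict[str, str] = {}
    for layer in (taskmanager, session, cli_file):
        merged.update(layer)
    return merged


if __name__ == "__main__":
    session = parse_spark_config("spark.executor.memory=2g;spark.executor.cores=2")
    print(effective_spark_config({"spark.executor.memory": "1g"}, session, {}))
```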

## Example

18 changes: 13 additions & 5 deletions docs/zh/faq/client_faq.md
@@ -76,13 +76,21 @@ SDK logs (glog logs):

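## Spark errors in offline commands

`java.lang.OutOfMemoryError: Java heap space`

The Spark configuration for offline commands defaults to `local[*]`; with high concurrency an OutOfMemoryError may occur. Adjust the two Spark settings `spark.driver.memory` and `spark.executor.memory`. You can write them in `spark.default.conf` in `conf/taskmanager.properties` under the TaskManager run directory and restart TaskManager, or configure them through the CLI client; see [Client Spark configuration file](../reference/client_config/client_spark_config.md)
```
spark.default.conf=spark.driver.memory=16g;spark.executor.memory=16g
java.lang.OutOfMemoryError: Java heap space
```

```
Container killed by YARN for exceeding memory limits. 5 GB of 5 GB physical memory used. Consider boosting spark.yarn.executor.memoryOverhead.
```

When any of the logs above appears, the offline job needs more resources than the current configuration provides. This usually happens in one of these cases:
- The Spark configuration for offline commands is `local[*]` and the machine has many cores, so concurrency is very high and resource usage is excessive
- The memory configuration is too small

In local mode the resources of a single machine are limited, so consider lowering the concurrency. If you do not lower the concurrency, adjust the two Spark settings `spark.driver.memory` and `spark.executor.memory`. You can write them in `spark.default.conf` in `conf/taskmanager.properties` under the TaskManager run directory and restart TaskManager, or configure them through the CLI client; see [Client Spark configuration file](../reference/client_config/client_spark_config.md)
```
spark.default.conf=spark.driver.memory=16g;spark.executor.memory=16g
```

When local, driver memory
When master is local, it is the driver's memory, not the executor's, that needs adjusting; if you are not sure, adjust both.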
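
As a rough sketch of the advice above, the following Python helpers pick which memory settings to tune for a given master and render the `spark.default.conf` line for `conf/taskmanager.properties`. Both function names are hypothetical, and the rule that any master string starting with `local` counts as local mode is an assumption made for illustration.

```python
from typing import Dict, List


def memory_keys_to_tune(master: str) -> List[str]:
    """Return the Spark memory settings worth adjusting (hypothetical helper)."""
    if master.startswith("local"):
        # In local mode the work runs inside the driver process.
        return ["spark.driver.memory"]
    return ["spark.driver.memory", "spark.executor.memory"]


def spark_default_conf_line(settings: Dict[str, str]) -> str:
    """Render settings as a single 'spark.default.conf=' properties line."""
    return "spark.default.conf=" + ";".join(f"{k}={v}" for k, v in settings.items())


if __name__ == "__main__":
    keys = memory_keys_to_tune("local[*]")
    print(spark_default_conf_line({key: "16g" for key in keys}))
```

Setting both keys, as the FAQ suggests when unsure, reproduces the line shown in the document.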
2 changes: 1 addition & 1 deletion docs/zh/openmldb_sql/ddl/SET_STATEMENT.md
@@ -35,7 +35,7 @@ sessionVariableName ::= '@@'Identifier | '@@session.'Identifier | '@@global.'Ide
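| @@session.enable_trace|@@enable_trace | When the value is `true`, an error stack is printed if the SQL statement has a syntax error or an error occurs during plan generation.<br />When the value is `false`, only the basic error message is printed if the SQL statement has a syntax error or an error occurs during plan generation. | "true" \| "false" | "false" |
| @@session.sync_job|@@sync_job | When the value is `true`, offline commands become synchronous and wait for the final execution result.<br />When the value is `false`, offline commands return immediately; to check a command's execution, use `SHOW JOB`. | "true" \| "false" | "false" |
| @@session.job_timeout|@@job_timeout | Configures the wait time (in *milliseconds*) for offline asynchronous commands or offline management commands, which return immediately. After an offline asynchronous command returns, you can still check its execution through `SHOW JOB`. | Int | "20000" |

| @@session.spark_config|@@spark_config | Sets the Spark parameters for offline jobs, for example 'spark.executor.memory=2g;spark.executor.cores=2'. Note that this Spark configuration has higher priority than the TaskManager default Spark configuration and lower priority than the command-line Spark configuration file. | String | "" |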
## Example

### Set and show session system variables
4 changes: 2 additions & 2 deletions docs/zh/openmldb_sql/dml/DELETE_STATEMENT.md
@@ -12,7 +12,7 @@ TableName ::=

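**Description**

- The `DELETE` statement deletes the data in an online table that matches the specified condition
- The `DELETE` statement deletes the data in an online table that matches the specified condition. It does not delete the matching data from every index; only the indexes related to the WHERE condition are affected. See [Function boundary](../../quickstart/function_boundary.md#delete) for an example
- The filter columns specified in `WHERE` must be index columns. A key column can only be compared with equality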
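
The per-index behavior described in the second bullet can be illustrated with a toy model. The Python sketch below is not OpenMLDB's storage engine: `ToyIndexedTable` is a hypothetical class in which every index keeps its own copy of the rows, and a delete touches only the index keyed by the WHERE column.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


class ToyIndexedTable:
    """Toy model: one row list per key, per index (not the real engine)."""

    def __init__(self, index_columns: List[str]) -> None:
        self.indexes: Dict[str, Dict[Any, List[Row]]] = {c: {} for c in index_columns}

    def put(self, row: Row) -> None:
        # Every index receives the row under its own key column.
        for column, index in self.indexes.items():
            index.setdefault(row[column], []).append(row)

    def delete_where_equals(self, column: str, value: Any) -> None:
        # The filter column must be an index column.
        if column not in self.indexes:
            raise ValueError(f"{column!r} is not an index column")
        # Only the index related to the WHERE condition is modified.
        self.indexes[column].pop(value, None)

    def lookup(self, column: str, value: Any) -> List[Row]:
        return list(self.indexes[column].get(value, []))


if __name__ == "__main__":
    table = ToyIndexedTable(["col1", "col2"])
    table.put({"col1": "aaaa", "col2": "x", "ts_col": 1687145994000})
    table.delete_where_equals("col1", "aaaa")
    print(table.lookup("col1", "aaaa"), table.lookup("col2", "x"))
```

In this model the row is gone when looked up through `col1` but still reachable through `col2`, which is the kind of surprise the added sentence warns about.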

## Examples
@@ -25,4 +25,4 @@ DELETE FROM t1 WHERE col1 = 'aaaa' and ts_col = 1687145994000;
DELETE FROM t1 WHERE col1 = 'aaaa' and ts_col > 1687059594000 and ts_col < 1687145994000;

DELETE FROM t1 WHERE ts_col > 1687059594000 and ts_col < 1687145994000;
```
```
6 changes: 5 additions & 1 deletion docs/zh/openmldb_sql/dml/LOAD_DATA_STATEMENT.md
@@ -1,6 +1,10 @@
# LOAD DATA INFILE
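The `LOAD DATA INFILE` statement efficiently reads data from files into a table in the database. `LOAD DATA INFILE` and `SELECT INTO OUTFILE` complement each other. To export data from a table to a file, use [SELECT INTO OUTFILE](../dql/SELECT_INTO_STATEMENT.md). To import file data into a table, use `LOAD DATA INFILE`.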

```{note}
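The filePath of INFILE can be a single file name, a directory, or a path that uses the `*` wildcard. If a directory contains files in several formats, only files in the FORMAT specified in LoadDataInfileOptionsList are selected. The accepted path format is equivalent to `DataFrameReader.read.load(String)`; you can use the Spark shell to read the path you want and confirm that it loads successfully.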
```

## Syntax

```sql
@@ -70,7 +74,7 @@ FilePathPattern
## SQL statement template

```sql
LOAD DATA INFILE 'file_name' INTO TABLE 'table_name' OPTIONS (key = value, ...);
LOAD DATA INFILE 'file_path' INTO TABLE 'table_name' OPTIONS (key = value, ...);
```

## Hive support
6 changes: 3 additions & 3 deletions docs/zh/openmldb_sql/dql/GROUP_BY_CLAUSE.md
@@ -2,9 +2,9 @@

## Syntax

```SQL
GroupByClause
::= 'GROUP' 'BY' ByList
```yacc
group_by_clause:
GROUP BY group_by_specification
```

## SQL statement template
6 changes: 3 additions & 3 deletions docs/zh/openmldb_sql/dql/HAVING_CLAUSE.md
@@ -4,9 +4,9 @@ The Having clause works similarly to the Where clause. The Having clause filters the post-GroupBy

## Syntax

```
HavingClause
::= 'HAVING' Expression
```yacc
having_clause
HAVING bool_expression
```

## SQL statement template
16 changes: 8 additions & 8 deletions docs/zh/openmldb_sql/dql/JOIN_CLAUSE.md
@@ -16,16 +16,16 @@ LAST JOIN is a JOIN type added by OpenMLDB SQL. Its syntax and LEFT JOIN

## Syntax

```
join:
TableRef "LAST" "JOIN" TableRef [OrderByClause] "ON" Expression
| TableRef join_type "JOIN" TableRef "ON" Expression
```yacc
join_operation:
condition_join_operation
join_type:
'LEFT' [OUTER]
condition_join_operation:
from_item LEFT [ OUTER ] JOIN from_item join_condition
| from_item LAST JOIN [ ORDER BY ordering_expression ] from_item join_condition
order_by_clause:
'ORDER' 'BY' <COLUMN_NAME>
join_condition:
ON bool_expression
```

### Usage restrictions
6 changes: 3 additions & 3 deletions docs/zh/openmldb_sql/dql/LIMIT_CLAUSE.md
@@ -4,9 +4,9 @@ The Limit clause limits the number of rows returned. Limit accepts one parameter,

## Syntax

```sql
LimitClause
::= 'LIMIT' int_leteral
```yacc
limit_clause:
LIMIT numeric_expression
```

## SQL statement template
10 changes: 2 additions & 8 deletions docs/zh/openmldb_sql/dql/NO_TABLE_SELECT_CLAUSE.md
@@ -4,14 +4,8 @@

## Syntax

```sql
NoTableSelectClause
::= 'SELECT' SelectExprList
SelectExprList
::= SelectExpr ( ',' SelectExpr )*
SelectExpr ::= ( Identifier '.' ( Identifier '.' )? )? '*'
| ( Expression | '{' Identifier Expression '}' ) ['AS' Identifier]

```yacc
SELECT { expression [ [ AS ] alias ] } [, ...];
```

## SQL statement template
23 changes: 8 additions & 15 deletions docs/zh/openmldb_sql/dql/SELECT_INTO_STATEMENT.md
@@ -5,21 +5,14 @@
```
## Syntax

```sql
SelectIntoStmt
::= SelectStmt 'INTO' 'OUTFILE' filePath SelectIntoOptionList

filePath
::= string_literal
SelectIntoOptionList
::= 'OPTIONS' '(' SelectInfoOptionItem (',' SelectInfoOptionItem)* ')'

SelectInfoOptionItem
::= 'DELIMITER' '=' string_literal
|'HEADER' '=' bool_literal
|'NULL_VALUE' '=' string_literal
|'FORMAT' '=' string_literal
|'MODE' '=' string_literal
```yacc
select_into_statement:
query INTO OUTFILE string_file_path
[ OPTIONS options_list ]
[ CONFIG options_list ]
options_list:
( { key = value } [, ...] )
```

`SELECT INTO OUTFILE` consists of three parts.