diff --git a/docs/en/openmldb_sql/index.rst b/docs/en/openmldb_sql/index.rst
new file mode 100644
index 00000000000..6380c333c9d
--- /dev/null
+++ b/docs/en/openmldb_sql/index.rst
@@ -0,0 +1,18 @@
+=============================
+OpenMLDB SQL
+=============================
+
+
+.. toctree::
+ :maxdepth: 1
+
+ sql_difference
+ language_structure/index
+ data_types/index
+ functions_and_operators/index
+ dql/index
+ dml/index
+ ddl/index
+ deployment_manage/index
+ task_manage/index
+ udf_develop_guide
diff --git a/docs/en/openmldb_sql/sql_difference.md b/docs/en/openmldb_sql/sql_difference.md
new file mode 100644
index 00000000000..feee7e6a9c4
--- /dev/null
+++ b/docs/en/openmldb_sql/sql_difference.md
@@ -0,0 +1,266 @@
+# Main Differences from Standard SQL
+
+This article provides a comparison between the main usage of OpenMLDB SQL (SELECT query statements) and standard SQL (using MySQL-supported syntax as an example). It aims to help developers with SQL experience quickly adapt to OpenMLDB SQL.
+
+Unless otherwise specified, the descriptions in this article apply to OpenMLDB >= v0.7.1.
+
+## Support Overview
+
+The table below summarizes the differences in support between OpenMLDB SQL and standard SQL for each element of the SELECT statement across the three execution modes (for execution mode details, please refer to [Workflow and Execution Modes](../quickstart/concepts/modes.md)). OpenMLDB SQL is currently partially compatible with standard SQL, with additional syntax introduced to accommodate specific business scenarios. New syntax is indicated in bold in the table.
+
+Note: ✓ indicates that the statement is supported, while ✕ indicates that it is not.
+
+| | **OpenMLDB SQL**<br>**Offline Mode** | **OpenMLDB SQL**<br>**Online Preview Mode** | **OpenMLDB SQL**<br>**Online Request Mode** | **Standard SQL** | **Remarks** |
+| -------------- | ---------------------------- | -------------------------------- | -------------------------------- | ------------ | ------------------------------------------------------------ |
+| WHERE Clause | ✓ | ✓ | ✕ | ✓ | Some functionalities can be achieved through built-in functions with the `_where` suffix. |
+| HAVING Clause | ✓ | ✓ | ✕ | ✓ | |
+| JOIN Clause | ✓ | ✕ | ✓ | ✓ | OpenMLDB only supports **LAST JOIN** and **LEFT JOIN**. |
+| GROUP BY | ✓ | ✕ | ✕ | ✓ | |
+| ORDER BY | ✓ | ✓ | ✓ | ✓ | Support is limited to usage within the `WINDOW` and `LAST JOIN` clauses; it does not support reverse sorting in `DESC`. |
+| LIMIT | ✓ | ✓ | ✕ | ✓ | |
+| WINDOW Clause | ✓ | ✓ | ✓ | ✓ | OpenMLDB includes new syntax **WINDOW UNION** and **WINDOW ATTRIBUTES**. |
+| WITH Clause | ✕ | ✕ | ✕ | ✓ | OpenMLDB supports the WITH clause starting from version v0.8.0. |
+| Aggregate Function | ✓ | ✓ | ✓ | ✓ | OpenMLDB has more extension functions. |
+
+
+
+## Explanation of Differences
+
+### Difference Dimension
+
+Compared to standard SQL, the differences in OpenMLDB SQL can be explained from three main perspectives:
+
+1. **Execution Mode**: OpenMLDB SQL has varying support for different SQL statements in three distinct execution modes: offline mode, online preview mode, and online request mode. The choice of execution mode depends on specific requirements. In general, for real-time computations in SQL, business SQL must adhere to the constraints of the online request mode.
+2. **Clause Combinations**: The combination of different clauses can introduce additional limitations. In these scenarios, one clause operates on the result set of another clause. For example, when LIMIT is applied to WHERE, the SQL would resemble `SELECT * FROM (SELECT * FROM t1 WHERE id >= 2) LIMIT 2`. The term 'table reference' used here refers to `FROM TableRef`, which does not represent a subquery or a complex FROM clause involving JOIN or UNION.
+3. **Special Restrictions**: Unique restrictions that do not fit the previous categories are explained separately. These restrictions are usually due to incomplete functionality or known program issues.
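+
+As a minimal sketch of how the three modes are typically selected, the statements below assume the `SET @@execute_mode` system variable and the `DEPLOY` statement described elsewhere in the OpenMLDB documentation. The table `t1`, its columns, and the deployment name `demo_deploy` are hypothetical and used only for illustration:
+
+```SQL
+-- Offline mode: subsequent queries run as offline batch jobs.
+SET @@execute_mode='offline';
+
+-- Online preview mode: subsequent queries run ad hoc against online data.
+SET @@execute_mode='online';
+
+-- Online request mode: the query is deployed and then evaluated per request row,
+-- so it must satisfy the online request constraints described in this article.
+DEPLOY demo_deploy SELECT
+    col1,
+    sum(col2) OVER w1 AS w1_col2_sum
+FROM
+    t1 WINDOW w1 AS (
+        PARTITION BY col1
+        ORDER BY col5
+        ROWS BETWEEN 10 PRECEDING AND CURRENT ROW
+    );
+```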
+
+### Configuration of Scanning Limits
+
+To prevent user errors from affecting online performance, OpenMLDB has introduced relevant parameters that limit the number of full table scans in offline mode and online preview mode. If these limitations are enabled, certain operations involving scans of multiple records (such as SELECT *, aggregation operations, etc.) may result in truncated results and, consequently, incorrect outcomes. It's essential to note that these parameters do not affect the accuracy of results in online request mode.
+
+The configuration of these parameters is done within the tablet configuration file `conf/tablet.flags`, as detailed in the document on [Configuration File](../deploy/conf.md#the-configuration-file-for-tablet-conftabletflags). The parameters affecting scan limits include:
+
+- Maximum Number of Scans: `--max_traverse_cnt`
+- Maximum Number of Scanned Keys: `--max_traverse_pk_cnt`
+- Size Limit for Returned Results: `--scan_max_bytes_size`
+
+In versions from v0.7.3 onwards, it's expected that the default values for these parameters will be set to 0, implying there will be no related restrictions. Users of earlier versions should take note of the parameter settings.
+
+### WHERE Clause
+
+| **Apply To** | **Offline Mode** | **Online Preview Mode** | **Online Request Mode** |
+| ------------------ | ------------ | ---------------- | ---------------- |
+| Table References | ✓ | ✓ | ✕ |
+| LAST JOIN | ✓ | ✓ | ✕ |
+| Subquery/ WITH Clause | ✓ | ✓ | ✕ |
+
+In the online request mode, the `WHERE` clause isn't supported. However, some functionalities can be achieved through computation functions with the `_where` suffix, like `count_where` and `avg_where`, among others. For detailed information, please refer to [Built-In Functions](./udfs_8h.md).
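+
+A minimal sketch of this substitution, assuming a hypothetical table `t1` with columns `col1`, `col2`, and `col5`: instead of filtering rows with `WHERE col2 > 10` before aggregating, the condition is passed as the second argument of the `_where` function inside a window.
+
+```SQL
+SELECT
+    -- count and average only the rows in the window where col2 > 10
+    count_where(col2, col2 > 10) OVER w1 AS w1_col2_cnt,
+    avg_where(col2, col2 > 10) OVER w1 AS w1_col2_avg
+FROM
+    t1 WINDOW w1 AS (
+        PARTITION BY col1
+        ORDER BY col5
+        ROWS BETWEEN 1000 PRECEDING AND CURRENT ROW
+    );
+```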
+
+### LIMIT Clause
+
+LIMIT must be followed by an INT literal; other expressions are not supported. It specifies the maximum number of rows to return. LIMIT is not supported in online request mode.
+
+| **Apply to** | **Offline Mode** | **Online Preview Mode** | **Online Request Mode** |
+| ----------------- | ---------------- | ----------------------- | ----------------------- |
+| Table Reference | ✓ | ✓ | ✕ |
+| WHERE | ✓ | ✓ | ✕ |
+| WINDOW | ✓ | ✓ | ✕ |
+| LAST JOIN | ✓ | ✓ | ✕ |
+| GROUP BY & HAVING | ✕ | ✓ | ✕ |
+
+### WINDOW Clause
+
+The WINDOW clause and the GROUP BY & HAVING clause cannot be used simultaneously. When transitioning to the online mode, the input table for the WINDOW clause must be either a physical table or a simple column filtering, along with LAST JOIN concatenation of the physical table. Simple column filtering entails a select list containing only column references or renaming columns, without additional expressions. You can refer to the table below for specific support scenarios. If a scenario is not listed, it means that it's not supported.
+
+| **Apply to** | **Offline Mode** | **Online Preview Mode** | **Online Request Mode** |
+| ------------------------------------------------------------ | ---------------- | ----------------------- | ----------------------- |
+| Table Reference | ✓ | ✓ | ✓ |
+| GROUP BY & HAVING | ✕ | ✕ | ✕ |
+| LAST JOIN | ✓ | ✓ | ✓ |
+| Subqueries are only allowed under these conditions:<br>1. Simple column filtering from a single table<br>2. Multi-table LAST JOIN<br>3. Simple column filtering after a dual-table LAST JOIN | ✓ | ✓ | ✓ |
+
+Special Restrictions:
+
+- In online request mode, the input for WINDOW can be a LAST JOIN or a LAST JOIN within a subquery. It's important to note that the columns for `PARTITION BY` and `ORDER BY` in the window definition must all originate from the leftmost table of the JOIN.
+
+### GROUP BY & HAVING Clause
+
+The GROUP BY statement is still considered an experimental feature and only supports a physical table as the input table. It's not supported in other scenarios. GROUP BY is also not available in the online mode.
+
+| **Apply to** | **Offline Mode** | **Online Preview Mode** | **Online Request Mode** |
+| --------------- | ---------------- | ----------------------- | ----------------------- |
+| Table Reference | ✓ | ✓ | ✕ |
+| WHERE | ✕ | ✕ | ✕ |
+| LAST JOIN | ✕ | ✕ | ✕ |
+| Subquery | ✕ | ✕ | ✕ |
+
+### JOIN Clause
+
+OpenMLDB exclusively supports the LAST JOIN and LEFT JOIN syntax. For a detailed description, please refer to the section on JOIN in the extended syntax. A JOIN consists of two inputs, the left and right. In the online request mode, it supports two inputs as physical tables or specific subqueries. You can refer to the table for specific details. If a scenario is not listed, it means it's not supported.
+
+| **Apply to** | **Offline Mode** | **Online Preview Mode** | **Online Request Mode** |
+| ---------------------------------------------- | ------------ | ---------------- | ---------------- |
+| LAST JOIN + two table references | ✓ | ✕ | ✓ |
+| LAST JOIN + simple column filtering for both tables | ✓ | ✕ | ✓ |
+| LAST JOIN + left table filtered with WHERE | ✓ | ✕ | ✓ |
+| LAST JOIN + one of the tables is a WINDOW or LAST JOIN | ✓ | ✕ | ✓ |
+| LAST JOIN + right table is LEFT JOIN subquery | ✕ | ✕ | ✓ |
+| LEFT JOIN | ✕ | ✕ | ✕ |
+
+Special Restrictions:
+- Launching LAST JOIN for specific subqueries involves additional requirements. For more information, please refer to [Online Requirements](../openmldb_sql/deployment_manage/ONLINE_REQUEST_REQUIREMENTS.md#specifications-of-last-join-under-online-request-mode).
+- LAST JOIN and LEFT JOIN are currently not supported in online preview mode.
+
+### WITH Clause
+
+OpenMLDB (>= v0.7.2) supports non-recursive WITH clauses. The WITH clause functions equivalently to how other clauses work when applied to subqueries. To understand how the WITH statement is supported, please refer to its corresponding subquery writing methods as explained in the table above.
+
+No special restrictions apply in this case.
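+
+A minimal sketch of this equivalence, assuming a hypothetical table `t1` with columns `col1`, `col2`, and `col5`. Both queries feed simple column filtering into a window; the first writes it as a WITH clause and the second as a subquery:
+
+```SQL
+-- WITH clause form
+WITH q1 AS (SELECT col1, col2, col5 FROM t1)
+SELECT
+    sum(col2) OVER w1 AS w1_col2_sum
+FROM
+    q1 WINDOW w1 AS (
+        PARTITION BY col1
+        ORDER BY col5
+        ROWS BETWEEN 10 PRECEDING AND CURRENT ROW
+    );
+
+-- Subquery form
+SELECT
+    sum(col2) OVER w1 AS w1_col2_sum
+FROM
+    (SELECT col1, col2, col5 FROM t1) AS q1 WINDOW w1 AS (
+        PARTITION BY col1
+        ORDER BY col5
+        ROWS BETWEEN 10 PRECEDING AND CURRENT ROW
+    );
+```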
+
+### ORDER BY Keyword
+
+The sorting keyword `ORDER BY` is only supported within the window definition of the `WINDOW` clause and within the `LAST JOIN` clause, and the reverse sorting keyword `DESC` is not supported. Detailed guidance on these clauses can be found in the WINDOW and LAST JOIN sections.
+
+### Aggregate Function
+
+Aggregation functions can be applied to all tables or windows. Window aggregation queries are supported in all three modes. Full table aggregation queries are only supported in online preview mode and are not available in offline and online request modes.
+
+- Regarding full table aggregation, OpenMLDB v0.6.0 began supporting this feature in online preview mode. However, it's essential to pay attention to the [Scanning Limit Configuration](#configuration-of-scanning-limits) described above.
+
+- OpenMLDB offers various extensions for aggregation functions. To find the specific functions supported, please consult the product documentation in [OpenMLDB Built-In Function](../openmldb_sql/udfs_8h.md).
+
+## Extended Syntax
+
+OpenMLDB has focused on deep customization of the `WINDOW` and `LAST JOIN` statements and this section will provide an in-depth explanation of these two statements.
+
+### WINDOW Clause
+
+A typical WINDOW statement in OpenMLDB generally includes the following elements:
+
+- Data Definition: Defines the data within the window using `PARTITION BY`.
+- Data Sorting: Defines the data sorting within the window using `ORDER BY`.
+- Scope Definition: Determines the direction of time extension through `PRECEDING`, `CURRENT ROW`, and `UNBOUNDED`.
+- Range Unit: Utilizes `ROWS` and `ROWS_RANGE` to specify the unit of window sliding range.
+- Window Attributes: Includes OpenMLDB-specific window attribute definitions, such as `MAXSIZE`, `EXCLUDE CURRENT_ROW`, `EXCLUDE CURRENT_TIME`, and `INSTANCE_NOT_IN_WINDOW`.
+- Multi-table Definition: Uses the extended syntax `WINDOW ... UNION` to determine whether concatenation of cross-table data sources is required.
+
+For a detailed syntax of the WINDOW statement, please refer to the [WINDOW Documentation](../openmldb_sql/dql/WINDOW_CLAUSE.md)
+
+| **Statement Element** | **Support Syntax** | **Description** | Required? |
+| ---------------------------------- | ------------------------------------------------------------ | ------------------------------------------------------------ | --------- |
+| Data Definition | PARTITION BY | OpenMLDB supports multiple column data types: bool, int16, int32, int64, string, date, timestamp. | ✓ |
+| Data Sorting | ORDER BY | - It only supports sorting on a single column.<br>- Supported data types for sorting include int16, int32, int64, and timestamp.<br>- Reverse order (`DESC`) is not supported.<br>- Must be specified for versions before v0.8.4. | - |
+| Scope Definition | Basic upper and lower bounds definition: ROWS/ROWS_RANGE BETWEEN ... AND ...<br>Scope definition is supported with keywords PRECEDING, OPEN PRECEDING, CURRENT ROW, UNBOUNDED | - Must specify both upper and lower boundaries.<br>- The boundary keyword `FOLLOWING` is not supported.<br>- In online request mode, `CURRENT ROW` represents the current request row. From a table perspective, the current row is virtually inserted into the appropriate position in the table based on the `ORDER BY` criteria. | ✓ |
+| Scope Unit | ROWS<br>ROWS_RANGE (Extended) | - ROWS_RANGE is an extended syntax for defining window boundaries, similar to standard SQL RANGE-type windows. It allows defining window boundaries with either numerical values or values with time units.<br>- Window ranges defined in time units are equivalent to window definitions where time is converted into milliseconds. For example, `ROWS_RANGE 10s PRECEDING ...` and `ROWS_RANGE 10000 PRECEDING...` are equivalent. | ✓ |
+| Window Properties (Extended) | MAXSIZE<br>EXCLUDE CURRENT_ROW<br>EXCLUDE CURRENT_TIME<br>INSTANCE_NOT_IN_WINDOW | - MAXSIZE is only valid for ROWS_RANGE windows.<br>- A window without ORDER BY cannot be used together with EXCLUDE CURRENT_TIME. | - |
+| Multi Table Definition (Extension) | In practical use, the syntax form is relatively complex. Please refer to:<br>[Cross Table Feature Development Tutorial](../tutorial/tutorial_sql_2.md)<br>[WINDOW UNION Syntax Documentation](../openmldb_sql/dql/WINDOW_CLAUSE.md#1-window--union) | - Merging of multiple tables is allowed<br>- Union of simple subqueries is allowed<br>- It is commonly used in combination with aggregation functions for cross-table aggregation operations. | - |
+| Anonymous Window | - | Complete window definition must include `PARTITION BY`, `ORDER BY`, and window range definition. | - |
+
+#### Special Restrictions
+
+In online preview mode or offline mode, there are certain known issues when using LIMIT or WHERE clauses as inputs to the WINDOW clause, and it's generally not recommended.
+
+#### Example of Window Definition
+
+Define a `ROWS` type window covering the preceding 1000 rows up to and including the current row.
+
+```SQL
+SELECT
+ sum(col2) OVER w1 as w1_col2_sum
+FROM
+ t1 WINDOW w1 AS (
+ PARTITION BY col1
+ ORDER BY
+ col5 ROWS BETWEEN 1000 PRECEDING
+ AND CURRENT ROW
+ );
+```
+
+Define a `ROWS_RANGE` type window covering all rows within the 10 seconds preceding the current row, including the current row.
+
+```SQL
+SELECT
+ sum(col2) OVER w1 as w1_col2_sum
+FROM
+ t1 WINDOW w1 AS (
+ PARTITION BY col1
+ ORDER BY
+ col5 ROWS_RANGE BETWEEN 10s PRECEDING
+ AND CURRENT ROW
+ );
+```
+
+Define a `ROWS` type window covering the preceding 1000 rows up to the current row, containing the current row itself but no other rows that share the current row's time.
+
+```SQL
+SELECT
+ sum(col2) OVER w1 as w1_col2_sum
+FROM
+ t1 WINDOW w1 AS (
+ PARTITION BY col1
+ ORDER BY
+ col5 ROWS BETWEEN 1000 PRECEDING
+ AND CURRENT ROW EXCLUDE CURRENT_TIME
+ );
+```
+
+Define a `ROWS_RANGE` type window covering the 10 seconds up to the current time, excluding the current request row.
+
+```SQL
+SELECT
+ sum(col2) OVER w1 as w1_col2_sum
+FROM
+ t1 WINDOW w1 AS (
+ PARTITION BY col1
+ ORDER BY
+ col5 ROWS_RANGE BETWEEN 10s PRECEDING
+ AND CURRENT ROW EXCLUDE CURRENT_ROW
+ );
+```
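+
+As a further sketch in the same style, the `MAXSIZE` attribute can cap the number of rows in a `ROWS_RANGE` window. The table and column names are the same hypothetical ones used in the examples above, and the cap of 100 is an arbitrary illustrative value:
+
+```SQL
+SELECT
+    sum(col2) OVER w1 as w1_col2_sum
+FROM
+    t1 WINDOW w1 AS (
+        PARTITION BY col1
+        ORDER BY
+            col5 ROWS_RANGE BETWEEN 10s PRECEDING
+            AND CURRENT ROW MAXSIZE 100
+    );
+```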
+
+Anonymous window:
+
+```SQL
+SELECT
+ id,
+ pk1,
+ col1,
+ std_ts,
+ sum(col1) OVER (
+ PARTITION BY pk1
+ ORDER BY
+ std_ts ROWS BETWEEN 1 PRECEDING
+ AND CURRENT ROW
+ ) as w1_col1_sum
+FROM t1;
+```
+
+#### Example of WINDOW ... UNION
+
+In practical development, many applications store data in multiple tables. In such cases, the syntax `WINDOW ... UNION` is commonly used for cross-table aggregation operations. Please refer to the "Multi-Table Aggregation Features" section in the [Cross-Table Feature Development Tutorial](../tutorial/tutorial_sql_2.md).
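+
+A minimal sketch of the syntax, assuming two hypothetical tables `t1` and `t2` with the same schema (columns `col1`, `col2`, and `col5`). For each row of `t1`, the window also draws rows of `t2` that share the same `col1` value:
+
+```SQL
+SELECT
+    col1,
+    col5,
+    sum(col2) OVER w1 AS w1_col2_sum
+FROM
+    t1 WINDOW w1 AS (
+        UNION t2
+        PARTITION BY col1
+        ORDER BY col5
+        ROWS_RANGE BETWEEN 10s PRECEDING AND CURRENT ROW
+    );
+```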
+
+### LAST JOIN Clause
+
+For detailed syntax specifications for LAST JOIN, please refer to the [LAST JOIN Documentation](../openmldb_sql/dql/JOIN_CLAUSE.md#join-clause).
+
+| **Statement Element** | **Support Syntax** | **Description** | Required? |
+| --------------------- | ------------------ | ------------------------------------------------------------ | --------- |
+| ON | ✓ | Supported column types include: BOOL, INT16, INT32, INT64, STRING, DATE, TIMESTAMP. | ✓ |
+| USING | ✕ | - | - |
+| ORDER BY | ✓ | - LAST JOIN extended syntax, not supported by LEFT JOIN.<br>- Only the following column types can be used: INT16, INT32, INT64, TIMESTAMP.<br>- The reverse order keyword DESC is not supported. | - |
+
+#### Example of LAST JOIN
+
+```SQL
+SELECT
+ *
+FROM
+ t1
+LAST JOIN t2 ON t1.col1 = t2.col1;
+
+SELECT
+ *
+FROM
+ t1
+LEFT JOIN t2 ON t1.col1 = t2.col1;
+```
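+
+The extended `ORDER BY` form of LAST JOIN can be sketched as follows. This assumes a hypothetical TIMESTAMP column `std_ts` in `t2`; among the rows of `t2` that match the `ON` condition, the one that sorts last by `std_ts` is joined:
+
+```SQL
+SELECT
+    *
+FROM
+    t1
+LAST JOIN t2 ORDER BY t2.std_ts ON t1.col1 = t2.col1;
+```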
+
diff --git a/docs/en/openmldb_sql/udf_develop_guide.md b/docs/en/openmldb_sql/udf_develop_guide.md
new file mode 100644
index 00000000000..1a2d73335a8
--- /dev/null
+++ b/docs/en/openmldb_sql/udf_develop_guide.md
@@ -0,0 +1,230 @@
+# UDF Development Guideline
+## Background
+Although OpenMLDB provides over a hundred built-in functions for data scientists to perform data analysis and feature extraction, there are scenarios where these functions might not fully meet the requirements. To facilitate users in quickly and flexibly implementing specific feature computation needs, we have introduced support for user-defined functions (UDFs) based on C++ development. Additionally, we enable the loading of dynamically generated user-defined function libraries.
+
+```{seealso}
+Users can also extend OpenMLDB's computation function library using the method of developing built-in functions. However, developing built-in functions requires modifying the source code and recompiling. If users wish to contribute extended functions to the OpenMLDB codebase, they can refer to [Built-in Function Develop Guide](./built_in_function_develop_guide.md).
+```
+
+## Development Procedures
+### Develop UDF functions
+#### Naming Convention of C++ Built-in Function
+- C++ built-in function names should follow the [snake_case](https://en.wikipedia.org/wiki/Snake_case) style.
+- The name should clearly express the function's purpose.
+- The name of a function should not be the same as the name of a built-in function or other custom functions. The list of all built-in functions can be seen [here](../openmldb_sql/udfs_8h.md).
+
+#### C++ Type and SQL Type Correlation
+The types of the built-in C++ functions' parameters should be BOOL, NUMBER, TIMESTAMP, DATE, or STRING.
+The SQL types corresponding to C++ types are shown as follows:
+
+| SQL Type | C/C++ Type |
+|:----------|:------------|
+| BOOL | `bool` |
+| SMALLINT | `int16_t` |
+| INT | `int32_t` |
+| BIGINT | `int64_t` |
+| FLOAT | `float` |
+| DOUBLE | `double` |
+| STRING | `StringRef` |
+| TIMESTAMP | `Timestamp` |
+| DATE | `Date` |
+
+
+#### Parameters and Return Values
+
+**Return Value**:
+
+* If the output type of the UDF is a basic type and `return_nullable` is set to false, the result is returned as the function's return value.
+* If the output type of the UDF is a basic type and `return_nullable` is set to true, the result is returned through a function parameter.
+* If the output type of the UDF is STRING, TIMESTAMP or DATE, the result is returned through the **last parameter** of the function.
+
+**Parameters**:
+
+* If the parameter is a basic type, it will be passed by value.
+* If the parameter type is STRING, TIMESTAMP or DATE, it will be passed by pointer.
+* The first parameter must be `UDFContext* ctx`. The definition of [UDFContext](../../../include/udf/openmldb_udf.h) is:
+
+```c++
+ struct UDFContext {
+ ByteMemoryPool* pool; // Used for memory allocation.
+ void* ptr; // Used for the storage of temporary variables for aggregate functions.
+ };
+```
+
+**Function Declaration**:
+
+* The functions must be declared with `extern "C"`.
+
+#### Memory Management
+
+- In scalar functions, the use of 'new' and 'malloc' to allocate space for input and output parameters is not allowed. However, temporary space allocation using 'new' and 'malloc' is permissible within the function, and the allocated space must be freed before the function returns.
+
+- In aggregate functions, space allocation using 'new' or 'malloc' can be performed in the 'init' function but must be released in the 'output' function. The final return value, if it is a string, needs to be stored in the space allocated by mempool.
+
+- If dynamic memory allocation is required, OpenMLDB provides memory management interfaces. Upon function execution completion, OpenMLDB will automatically release the memory.
+```c++
+char *buffer = ctx->pool->Alloc(size);
+```
+- The maximum size allocated at once cannot exceed 2M.
+
+**Note**:
+- If the parameters are declared as nullable, then all parameters are nullable, and each input parameter will have an additional `is_null` parameter.
+- If the return value is declared as nullable, it will be returned through parameters, and an additional `is_null` parameter will indicate whether the return value is null.
+
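+For instance, a scalar UDF named `sum` that takes two parameters, where both the inputs and the return value are nullable, is declared as follows:
+```c++
+extern "C"
+void sum(::openmldb::base::UDFContext* ctx, int64_t input1, bool is_null1, int64_t input2, bool is_null2, int64_t* output, bool* is_null) {
+    // function body omitted; each input has its own is_null flag, and the last flag reports whether the output is null
+}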
+```
+#### Scalar Function Implementation
+
+Scalar functions process individual data rows and return a single value, such as abs, sin, cos, date, year.
+The process is as follows:
+- The header file `udf/openmldb_udf.h` should be included.
+- Develop the logic of the function.
+
+```c++
+#include "udf/openmldb_udf.h" // must include this header file
+
+// Develop a UDF that slices the first 2 characters of a given string.
+extern "C"
+void cut2(::openmldb::base::UDFContext* ctx, ::openmldb::base::StringRef* input, ::openmldb::base::StringRef* output) {
+ if (input == nullptr || output == nullptr) {
+ return;
+ }
+ uint32_t size = input->size_ <= 2 ? input->size_ : 2;
+ //use ctx->pool for memory allocation
+ char *buffer = ctx->pool->Alloc(size);
+ memcpy(buffer, input->data_, size);
+ output->size_ = size;
+ output->data_ = buffer;
+}
+```
+
+
+#### Aggregation Function Implementation
+
+Aggregate functions process a dataset (such as a column of data) and perform computations, returning a single value, such as sum, avg, max, min, count.
+The process is as follows:
+- The header file `udf/openmldb_udf.h` should be included.
+- Develop the logic of the function.
+
+To develop an aggregate function, you need to implement the following three C++ methods:
+
+- init function: Perform initialization tasks such as allocating space for intermediate variables. Function naming format: 'aggregate_function_name_init'.
+
+- update function: Implement the logic for processing each row of the respective field in the update function. Function naming format: 'aggregate_function_name_update'.
+
+- output function: Process the final aggregated value and return the result. Function naming format: 'aggregate_function_name_output'.
+
+**Note**: The init and update functions must return `UDFContext*` as their return value.
+
+```c++
+#include "udf/openmldb_udf.h" //must include this header file
+// implementation of aggregation function special_sum
+extern "C"
+::openmldb::base::UDFContext* special_sum_init(::openmldb::base::UDFContext* ctx) {
+ // allocate space for intermediate variables and assign to 'ptr' in UDFContext.
+ ctx->ptr = ctx->pool->Alloc(sizeof(int64_t));
+ // init the value
+ *(reinterpret_cast<int64_t*>(ctx->ptr)) = 10;
+ // return pointer of UDFContext, cannot be omitted
+ return ctx;
+}
+
+extern "C"
+::openmldb::base::UDFContext* special_sum_update(::openmldb::base::UDFContext* ctx, int64_t input) {
+ // get the value from ptr in UDFContext
+ int64_t cur = *(reinterpret_cast<int64_t*>(ctx->ptr));
+ cur += input;
+ *(reinterpret_cast<int64_t*>(ctx->ptr)) = cur;
+ // return the pointer of UDFContext, cannot be omitted
+ return ctx;
+}
+
+// get the aggregation result from ptr in UDFContext and return
+extern "C"
+int64_t special_sum_output(::openmldb::base::UDFContext* ctx) {
+ return *(reinterpret_cast<int64_t*>(ctx->ptr)) + 5;
+}
+
+```
+
+
+For more UDF implementation examples, see [here](../../../src/examples/test_udf.cc).
+
+
+### Compile Dynamic Library
+
+- Copy the `include` directory (`https://github.com/4paradigm/OpenMLDB/tree/main/include`) to a certain path (like `/work/OpenMLDB/`) for later compiling.
+- Run the compiling command. `-I` specifies the path of the `include` directory. `-o` specifies the name of the dynamic library.
+
+```shell
+g++ -shared -o libtest_udf.so examples/test_udf.cc -I /work/OpenMLDB/include -std=c++11 -fPIC
+```
+
+### Copy Dynamic Library
+The compiled dynamic libraries should be copied into the `udf` directories for both TaskManager and tablets. Please create a new `udf` directory if it does not exist.
+- The `udf` directory of a tablet is `path_to_tablet/udf`.
+- The `udf` directory of TaskManager is `path_to_taskmanager/taskmanager/bin/udf`.
+
+For example, if the deployment paths of a tablet and TaskManager are both `/work/openmldb`, the structure of the directory is shown below:
+
+```
+ /work/openmldb/
+ ├── bin
+ ├── conf
+ ├── taskmanager
+ │ ├── bin
+ │ │ ├── taskmanager.sh
+ │ │ └── udf
+ │ │ └── libtest_udf.so
+ │ ├── conf
+ │ └── lib
+ ├── tools
+ └── udf
+ └── libtest_udf.so
+```
+
+```{note}
+- For multiple tablets, the library needs to be copied to every tablet.
+- Dynamic libraries should not be deleted before the execution of `DROP FUNCTION`.
+```
+
+
+### Register, Drop and Show the Functions
+For registering, please use [CREATE FUNCTION](../openmldb_sql/ddl/CREATE_FUNCTION.md).
+
+Register a scalar function:
+```sql
+CREATE FUNCTION cut2(x STRING) RETURNS STRING OPTIONS (FILE='libtest_udf.so');
+```
+Register an aggregation function:
+```sql
+CREATE AGGREGATE FUNCTION special_sum(x BIGINT) RETURNS BIGINT OPTIONS (FILE='libtest_udf.so');
+```
+Register an aggregation function whose input and return values can be null:
+```sql
+CREATE AGGREGATE FUNCTION third(x BIGINT) RETURNS BIGINT OPTIONS (FILE='libtest_udf.so', ARG_NULLABLE=true, RETURN_NULLABLE=true);
+```
+
+**Note**:
+- The types of parameters and return values must be consistent with the implementation of the code.
+- `FILE` specifies the file name of the dynamic library. It is not necessary to include a path.
+- A UDF function can only work on one type. Please create multiple functions for multiple types.
+
+
+After successful registration, the function can be used.
+```sql
+SELECT cut2(c1) FROM t1;
+```
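+
+A registered aggregate function is used like a built-in aggregate. The sketch below assumes a hypothetical table `t1` with a partition column `col1`, a BIGINT column `col2` (matching the `special_sum(x BIGINT)` signature registered above), and a TIMESTAMP column `col5` for ordering:
+
+```sql
+SELECT
+    special_sum(col2) OVER w1 AS w1_special_sum
+FROM
+    t1 WINDOW w1 AS (
+        PARTITION BY col1
+        ORDER BY col5
+        ROWS BETWEEN 10 PRECEDING AND CURRENT ROW
+    );
+```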
+
+You can view registered functions through `SHOW FUNCTIONS`.
+```sql
+SHOW FUNCTIONS;
+```
+
+Use `DROP FUNCTION` to delete a registered function.
+```sql
+DROP FUNCTION cut2;
+```
diff --git a/docs/zh/openmldb_sql/udf_develop_guide.md b/docs/zh/openmldb_sql/udf_develop_guide.md
index 761e66dea6f..89771df4b5a 100644
--- a/docs/zh/openmldb_sql/udf_develop_guide.md
+++ b/docs/zh/openmldb_sql/udf_develop_guide.md
@@ -1,18 +1,19 @@
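# User-Defined Function (UDF) Development
-## 1. Background
+## Background
Although OpenMLDB has over a hundred built-in functions for data scientists to perform data analysis and feature extraction, they cannot fully meet the requirements in some scenarios. To help users implement specific feature computation needs quickly and flexibly, we support user-defined function (UDF) development based on C++, as well as the loading of dynamic user-defined function libraries.
```{seealso}
Users can also extend OpenMLDB's computation function library by developing built-in functions. However, developing built-in functions requires modifying the source code and recompiling. Users who wish to contribute extended functions to the OpenMLDB codebase can refer to the [built-in function development guide](../developer/built_in_function_develop_guide.md).
```
-## 2. Development Procedures
-### 2.1 Develop UDF Functions
-#### 2.1.1 Naming Convention of C++ Functions
+## Development Procedures
+### Develop UDF Functions
+#### Naming Convention of C++ Functions
- C++ built-in function names should follow the [snake_case](https://en.wikipedia.org/wiki/Snake_case) style
- The name should clearly express the function's purpose
- Function names must be unique. A function name cannot be the same as that of a built-in function or another custom function. For the list of all built-in functions, see [here](../openmldb_sql/udfs_8h.md)
-#### 2.1.2 C++ Type and SQL Type Correlation
+#### C++ Type and SQL Type Correlation
+
The parameter types of built-in C++ functions are limited to BOOL, numeric types, timestamp and date types, and string types. The correspondence between C++ types and SQL types is as follows:
| SQL Type | C/C++ Type |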
@@ -26,7 +27,7 @@
| STRING | `StringRef` |
| TIMESTAMP | `Timestamp` |
| DATE | `Date` |
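-#### 2.1.3 Parameters and Return Values
+#### Parameters and Return Values
Return value:
* If the output type of the UDF is a basic type and `return_nullable` is set to false, the result is returned as the function's return value
* If the output type of the UDF is a basic type and `return_nullable` is set to true, the result is returned through a function parameter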
@@ -46,7 +47,7 @@
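Function declaration:
* Functions must be declared with extern "C"
-#### 2.1.4 Memory Management
+#### Memory Management
- In scalar functions, using `new` and `malloc` to allocate space for input and output parameters is not allowed. Temporary space may be allocated with `new` and `malloc` inside the function, but it must be freed before the function returns.
- In aggregate functions, space can be allocated with `new`/`malloc` in the init function, but it must be released in the output function. If the final return value is a string, it must be stored in space allocated by the mempool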
@@ -67,7 +68,7 @@ extern "C"
void sum(::openmldb::base::UDFContext* ctx, int64_t input1, bool is_null, int64_t input2, bool is_null, int64_t* output, bool* is_null) {
```
-#### 2.1.5 Scalar Function Implementation
+#### Scalar Function Implementation
Scalar functions process a single row of data and return a single value, such as `abs`, `sin`, `cos`, `date`, and `year`.
@@ -94,7 +95,7 @@ void cut2(::openmldb::base::UDFContext* ctx, ::openmldb::base::StringRef* input,
}
```
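-#### 2.1.6 Aggregation Function Implementation
+#### Aggregation Function Implementation
Aggregate functions perform a computation over a dataset (such as a column of data) and return a single value, such as `sum`, `avg`, `max`, `min`, and `count`.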
@@ -144,15 +145,15 @@ int64_t special_sum_output(::openmldb::base::UDFContext* ctx) {
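For more UDF/UDAF implementations, see [here](../../../src/examples/test_udf.cc).
-### 2.2 Compile Dynamic Library
+### Compile Dynamic Library
- Copy the include directory `https://github.com/4paradigm/OpenMLDB/tree/main/include` to a certain path, such as /work/OpenMLDB/; it is needed for compiling in the next step.
- Run the compiling command, where -I specifies the path of the include directory and -o specifies the name of the output dynamic library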
--
+
```shell
g++ -shared -o libtest_udf.so examples/test_udf.cc -I /work/OpenMLDB/include -std=c++17 -fPIC
```
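-### 2.3 Copy Dynamic Library
+### Copy Dynamic Library
The compiled dynamic library needs to be copied to the TaskManager and tablets. If the `udf` directory does not exist in the TaskManager and tablets, create it first and restart these processes (to make sure the environment variables take effect).
- The UDF directory of a tablet is `path_to_tablet/udf`.
- The UDF directory of TaskManager is `path_to_taskmanager/taskmanager/bin/udf`.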
@@ -181,7 +182,7 @@ g++ -shared -o libtest_udf.so examples/test_udf.cc -I /work/OpenMLDB/include -st
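- Do not delete the dynamic library before executing `DROP FUNCTION`.
```
-### 2.4 Register, Drop and Show Functions
+### Register, Drop and Show Functions
Use [CREATE FUNCTION](../openmldb_sql/ddl/CREATE_FUNCTION.md) to register functions
Register a scalar function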