diff --git a/docs/design/autoware-architecture/perception/image/high-level-perception-diagram.drawio.svg b/docs/design/autoware-architecture/perception/image/high-level-perception-diagram.drawio.svg
new file mode 100644
index 0000000000..c9601c7867
--- /dev/null
+++ b/docs/design/autoware-architecture/perception/image/high-level-perception-diagram.drawio.svg
@@ -0,0 +1,4 @@
+
+
+
+
\ No newline at end of file
diff --git a/docs/design/autoware-architecture/perception/image/reference-implementaion-perception-diagram.drawio.svg b/docs/design/autoware-architecture/perception/image/reference-implementaion-perception-diagram.drawio.svg
new file mode 100644
index 0000000000..8abfd017a0
--- /dev/null
+++ b/docs/design/autoware-architecture/perception/image/reference-implementaion-perception-diagram.drawio.svg
@@ -0,0 +1,4 @@
+
+
+
+
\ No newline at end of file
diff --git a/docs/design/autoware-architecture/perception/index.md b/docs/design/autoware-architecture/perception/index.md
index 867511f13f..101d082d15 100644
--- a/docs/design/autoware-architecture/perception/index.md
+++ b/docs/design/autoware-architecture/perception/index.md
@@ -1,5 +1,96 @@
-# Perception component design
+# Perception Component Design
-!!! warning
+## Purpose of this document
- Under Construction
+This document outlines the high-level design strategies, goals, and related rationales behind the development of the Perception Component. It is intended to help all OSS developers understand the design philosophy, goals, and constraints under which the Perception Component is designed, so that they can participate seamlessly in its development.
+
+## Overview
+
+The Perception Component receives inputs from the Sensing, Localization, and Map components and adds semantic information (e.g., Object Recognition, Obstacle Segmentation, Traffic Light Recognition, Occupancy Grid Map), which is then passed on to the Planning Component. This component design follows the overarching philosophy of Autoware, defined as the [microautonomy concept](https://autowarefoundation.github.io/autoware-documentation/main/design/autoware-concepts/).
+
+## Goals and non-goals
+
+The role of the Perception Component is to recognize the surrounding environment based on the data obtained through Sensing and acquire sufficient information (such as the presence of dynamic objects, stationary obstacles, blind spots, and traffic signal information) to enable autonomous driving.
+
+In our overall design, we emphasize the concept of [microautonomy architecture](https://autowarefoundation.github.io/autoware-documentation/main/design/autoware-concepts). This term refers to a design approach that focuses on the proper modularization of functions, clear definition of interfaces between these modules, and as a result, high expandability of the system. Given this context, the goal of the Perception Component is set not to solve every conceivable complex use case (although we do aim to support basic ones), but rather to provide a platform that can be customized to the user's needs and can facilitate the development of additional features.
+
+To clarify the design concepts, the following points are listed as goals and non-goals.
+
+**Goals:**
+
+- To provide the basic functions so that a simple ODD can be defined.
+- To achieve a design that can provide perception functionality to every autonomous vehicle.
+- To be extensible with third-party components.
+- To provide a platform that enables Autoware users to develop the complete functionality and capability.
+- To provide a platform that enables Autoware users to develop an autonomous driving system that always outperforms human drivers.
+- To provide a platform that enables Autoware users to develop an autonomous driving system that achieves "100% accuracy" or "error-free recognition".
+
+**Non-goals:**
+
+- To develop a perception component architecture specialized for specific / limited ODDs.
+- To achieve the complete functionality and capability.
+- To outperform the recognition capability of human drivers.
+- To achieve "100% accuracy" or "error-free recognition".
+
+## High-level architecture
+
+This diagram describes the high-level architecture of the Perception Component.
+
+![overall-perception-architecture](image/high-level-perception-diagram.drawio.svg)
+
+The Perception Component consists of the following sub-components:
+
+- **Object Recognition**: Recognizes dynamic objects surrounding the ego vehicle in the current frame and predicts their future trajectories.
+- **Obstacle Segmentation**: Identifies point clouds originating from obstacles that the ego vehicle should avoid, covering not only dynamic objects but also static obstacles.
+- **Occupancy Grid Map**: Detects blind spots (areas where no information is available and where dynamic objects may jump out).
+- **Traffic Light Recognition**: Recognizes the colors of traffic lights and the directions of arrow signals.
+
+## Component interface
+
+The following describes the input/output concept between the Perception Component and other components. See [the Perception Component Interface](../../autoware-interfaces/components/perception.md) page for the current implementation.
+
+### Input to the Perception Component
+
+- **From Sensing**: This input should provide real-time information about the environment.
+ - Camera Image: Image data obtained from the camera.
+ - Point Cloud: Point Cloud data obtained from LiDAR.
+ - Radar Object: Object data obtained from radar.
+- **From Localization**: This input should provide real-time information about the ego vehicle.
+ - Vehicle motion information: Includes the ego vehicle's position.
+- **From Map**: This input should provide static information about the environment.
+  - Vector Map: Contains all static information about the environment, including lane area information.
+ - Point Cloud Map: Contains static point cloud maps, which should not include information about the dynamic objects.
+- **From API**:
+ - V2X information: The information from V2X modules. For example, the information from traffic signals.
+
+### Output from the Perception Component
+
+- **To Planning**
+ - Dynamic Objects: Provides real-time information about objects that cannot be known in advance, such as pedestrians and other vehicles.
+  - Obstacle Segmentation: Supplies real-time information about the location of obstacles, which is more primitive than Detected Objects.
+  - Occupancy Grid Map: Offers real-time information about the presence of occluded areas.
+ - Traffic Light Recognition result: Provides the current state of each traffic light in real time.
+
+## How to add new modules (WIP)
+
+As mentioned in the goals section, this perception module is designed to be extensible by third-party components. For specific instructions on how to add new modules and expand its functionality, please refer to the provided documentation or guidelines (WIP).
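+
+Until those guidelines are published, the sketch below illustrates one possible shape of a third-party module: a ROS 2 node that subscribes to the concatenated LiDAR point cloud and publishes its results as `DetectedObjects`. This is a minimal sketch, not a prescribed API; the topic names and message package are assumptions taken from [the Perception Component Interface](../../autoware-interfaces/components/perception.md) page.
+
+```python
+# Minimal sketch of a third-party detector node (illustrative only).
+# Assumes an Autoware workspace where autoware_auto_perception_msgs is available
+# and the reference topic names from the interface page are used.
+import rclpy
+from rclpy.node import Node
+from sensor_msgs.msg import PointCloud2
+from autoware_auto_perception_msgs.msg import DetectedObjects
+
+
+class MyDetector(Node):
+    def __init__(self):
+        super().__init__("my_detector")
+        # Input: concatenated point cloud provided by the Sensing component.
+        self.sub = self.create_subscription(
+            PointCloud2, "/sensing/lidar/concatenated/pointcloud", self.on_cloud, 1
+        )
+        # Output: detections to be merged with other detectors (e.g., by Object Merger).
+        self.pub = self.create_publisher(DetectedObjects, "~/output/objects", 1)
+
+    def on_cloud(self, cloud: PointCloud2):
+        objects = DetectedObjects()
+        objects.header = cloud.header  # keep the sensor frame and timestamp
+        # ... run the detection algorithm here and fill objects.objects ...
+        self.pub.publish(objects)
+
+
+def main():
+    rclpy.init()
+    rclpy.spin(MyDetector())
+
+
+if __name__ == "__main__":
+    main()
+```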
+
+## Supported Functions
+
+| Feature | Description | Requirements |
+| ---------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------- |
+| LiDAR DNN based 3D detector | This module takes point clouds as input and detects objects such as vehicles, trucks, buses, pedestrians, and bicycles. | - Point Clouds |
+| Camera DNN based 2D detector | This module takes camera images as input and detects objects such as vehicles, trucks, buses, pedestrians, and bicycles in the two-dimensional image space. It detects objects within image coordinates, and providing 3D coordinate information is not mandatory. | - Camera Images |
+| LiDAR Clustering | This module performs clustering of point clouds and shape estimation to achieve object detection without labels. | - Point Clouds |
+| Semi-rule based detector | This module detects objects using information from both images and point clouds, and it consists of two components: LiDAR Clustering and Camera DNN based 2D detector. | - Output from Camera DNN based 2D detector and LiDAR Clustering |
+| Object Merger | This module integrates results from various detectors. | - Detected Objects |
+| Interpolator | This module stabilizes object detection by using Tracking results to maintain long-term detection results. | - Detected Objects <br>- Tracked Objects |
+| Tracking | This module assigns an ID to each detection result and estimates its velocity. | - Detected Objects |
+| Prediction | This module predicts the future paths (and their probabilities) of dynamic objects according to the shape of the map and the surrounding environment. | - Tracked Objects <br>- Vector Map |
+| Obstacle Segmentation | This module identifies point clouds originating from obstacles that the ego vehicle should avoid. | - Point Clouds <br>- Point Cloud Map |
+| Occupancy Grid Map | This module detects blind spots (areas where no information is available and where dynamic objects may jump out). | - Point Clouds <br>- Point Cloud Map |
+| Traffic Light Recognition | This module detects the position and state of traffic signals. | - Camera Images <br>- Vector Map |
+
+## Reference Implementation
+
+When Autoware is launched, the default parameters are loaded, and the Reference Implementation is started. For more details, please refer to [the Reference Implementation](./reference_implementation.md).
diff --git a/docs/design/autoware-architecture/perception/reference_implementation.md b/docs/design/autoware-architecture/perception/reference_implementation.md
new file mode 100644
index 0000000000..313471ae16
--- /dev/null
+++ b/docs/design/autoware-architecture/perception/reference_implementation.md
@@ -0,0 +1,32 @@
+# Perception Component Reference Implementation Design
+
+## Purpose of this document
+
+This document outlines the detailed design of the reference implementation. This allows developers and users to understand what is currently available with the Perception Component and how to utilize, expand, or add to its features.
+
+## Architecture
+
+This diagram describes the architecture of the reference implementation.
+
+![overall-perception-architecture](image/reference-implementaion-perception-diagram.drawio.svg)
+
+The Perception component consists of the following sub-components:
+
+- **Obstacle Segmentation**: Identifies point clouds originating from obstacles that the ego vehicle should avoid, covering not only dynamic objects but also static obstacles. For example, construction cones are recognized using this module.
+- **Occupancy Grid Map**: Detects blind spots (areas where no information is available and where dynamic objects may jump out).
+- **Object Recognition**: Recognizes dynamic objects surrounding the ego vehicle in the current frame and predicts their future trajectories.
+ - **Detection**: Detects the pose and velocity of dynamic objects such as vehicles and pedestrians.
+ - **Detector**: Triggers object detection processing frame by frame.
+ - **Interpolator**: Maintains stable object detection. Even if the output from Detector suddenly becomes unavailable, Interpolator uses the output from the Tracking module to maintain the detection results without missing any objects.
+ - **Tracking**: Associates detected results across multiple frames.
+ - **Prediction**: Predicts trajectories of dynamic objects.
+- **Traffic Light Recognition**: Recognizes the colors of traffic lights and the directions of arrow signals.
+
+### Internal interface in the perception component
+
+- **Obstacle Segmentation to Object Recognition**
+ - Point Cloud: A Point Cloud observed in the current frame, where the ground and outliers are removed.
+- **Obstacle Segmentation to Occupancy Grid Map**
+ - Ground filtered Point Cloud: A Point Cloud observed in the current frame, where the ground is removed.
+- **Occupancy Grid Map to Obstacle Segmentation**
+  - Occupancy Grid Map: This is used for filtering outliers.
diff --git a/docs/design/autoware-interfaces/components/perception.md b/docs/design/autoware-interfaces/components/perception.md
new file mode 100644
index 0000000000..a90ccd1142
--- /dev/null
+++ b/docs/design/autoware-interfaces/components/perception.md
@@ -0,0 +1,49 @@
+# Perception
+
+This page provides the specification of the Perception Component interface. Please refer to [the perception architecture reference implementation design document](../../autoware-architecture/perception/reference_implementation.md) for concepts and data flow.
+
+## Input
+
+### From Map Component
+
+| Name | Topic / Service | Type | Description |
+| --------------- | ----------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------- |
+| Vector Map | `/map/vector_map` | [autoware_auto_mapping_msgs/msg/HADMapBin](https://github.com/tier4/autoware_auto_msgs/blob/tier4/main/autoware_auto_mapping_msgs/msg/HADMapBin.idl) | HD Map including the information about lanes |
+| Point Cloud Map | `/service/get_differential_pcd_map` | [autoware_map_msgs/srv/GetDifferentialPointCloudMap](https://github.com/autowarefoundation/autoware_msgs/blob/main/autoware_map_msgs/srv/GetDifferentialPointCloudMap.srv) | Point Cloud Map |
+
+Notes:
+
+- Point Cloud Map
+  - The input can be provided as either a topic or a service, but we highly recommend using the service, since this interface enables processing without being constrained by map file size limits (see the sketch below).
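+
+As a rough illustration of the service-based interface, the sketch below queries the differential point cloud map from a ROS 2 node. This is a minimal sketch, not a reference client: only the call pattern is shown, and the request fields are left to be filled according to the `GetDifferentialPointCloudMap.srv` definition.
+
+```python
+# Minimal sketch of querying the point cloud map via the recommended service.
+# Assumes autoware_map_msgs is available; fill the request fields (queried area,
+# already-cached cell IDs, ...) according to GetDifferentialPointCloudMap.srv.
+import rclpy
+from rclpy.node import Node
+from autoware_map_msgs.srv import GetDifferentialPointCloudMap
+
+
+class PcdMapClient(Node):
+    def __init__(self):
+        super().__init__("pcd_map_client")
+        self.cli = self.create_client(
+            GetDifferentialPointCloudMap, "/service/get_differential_pcd_map"
+        )
+
+    def request_map(self):
+        if not self.cli.wait_for_service(timeout_sec=5.0):
+            self.get_logger().warn("point cloud map service is not available")
+            return None
+        req = GetDifferentialPointCloudMap.Request()
+        # ... set the queried area / already-cached cell IDs here, per the srv definition ...
+        future = self.cli.call_async(req)
+        rclpy.spin_until_future_complete(self, future)
+        return future.result()
+
+
+def main():
+    rclpy.init()
+    node = PcdMapClient()
+    print(node.request_map())
+
+
+if __name__ == "__main__":
+    main()
+```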
+
+### From Sensing Component
+
+| Name | Topic | Type | Description |
+| ------------ | ------------------------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------- |
+| Camera Image | `/sensing/camera/camera*/image_rect_color` | [sensor_msgs/Image](https://github.com/ros2/common_interfaces/blob/humble/sensor_msgs/msg/Image.msg) | Camera image data, processed with Lens Distortion Correction (LDC) |
+| Camera Image | `/sensing/camera/camera*/image_raw` | [sensor_msgs/Image](https://github.com/ros2/common_interfaces/blob/humble/sensor_msgs/msg/Image.msg) | Camera image data, not processed with Lens Distortion Correction (LDC) |
+| Point Cloud | `/sensing/lidar/concatenated/pointcloud` | [sensor_msgs/PointCloud2](https://github.com/ros2/common_interfaces/blob/humble/sensor_msgs/msg/PointCloud2.msg) | Concatenated point cloud from multiple LiDAR sources |
+| Radar Object | `/sensing/radar/detected_objects` | [autoware_auto_perception_msgs/msg/DetectedObject](https://gitlab.com/autowarefoundation/autoware.auto/autoware_auto_msgs/-/blob/master/autoware_auto_perception_msgs/msg/DetectedObject.idl) | Radar objects |
+
+### From Localization Component
+
+| Name | Topic | Type | Description |
+| ---------------- | ------------------------------- | -------------------------------------------------------------------------------------------------------- | -------------------------- |
+| Vehicle Odometry | `/localization/kinematic_state` | [nav_msgs/msg/Odometry](https://github.com/ros2/common_interfaces/blob/humble/nav_msgs/msg/Odometry.msg) | Ego vehicle odometry topic |
+
+### From API
+
+| Name | Topic | Type | Description |
+| ------------------------ | --------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------- |
+| External Traffic Signals | `/external/traffic_signals` | [autoware_perception_msgs::msg::TrafficSignalArray](https://github.com/autowarefoundation/autoware_msgs/blob/main/autoware_perception_msgs/msg/TrafficSignalArray.msg) | The traffic signals from an external system |
+
+## Output
+
+### To Planning
+
+| Name | Topic | Type | Description |
+| ------------------ | ------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | --------------------------------------------------------------------------------------------------------- |
+| Dynamic Objects    | `/perception/object_recognition/objects`                | [autoware_auto_perception_msgs/msg/PredictedObjects](https://github.com/tier4/autoware_auto_msgs/blob/tier4/main/autoware_auto_perception_msgs/msg/PredictedObjects.idl) | Set of dynamic objects with information such as the object class and the shape of each object |
+| Obstacles          | `/perception/obstacle_segmentation/pointcloud`          | [sensor_msgs/PointCloud2](https://github.com/ros2/common_interfaces/blob/humble/sensor_msgs/msg/PointCloud2.msg) | Obstacles, which include dynamic objects and static objects |
+| Occupancy Grid Map | `/perception/occupancy_grid_map/map`                    | [nav_msgs/msg/OccupancyGrid](https://docs.ros.org/en/latest/api/nav_msgs/html/msg/OccupancyGrid.html) | The map with information about the presence of obstacles and blind spots |
+| Traffic Signal     | `/perception/traffic_light_recognition/traffic_signals` | [autoware_perception_msgs::msg::TrafficSignalArray](https://github.com/autowarefoundation/autoware_msgs/blob/main/autoware_perception_msgs/msg/TrafficSignalArray.msg) | The traffic signal information such as the color (green, yellow, red) and the arrow (right, left, straight) |
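+
+As an illustration of consuming these outputs on the Planning side, the sketch below subscribes to the dynamic objects topic. This is a minimal sketch under the assumption that `autoware_auto_perception_msgs` is available in the workspace; anything beyond iterating over `msg.objects` is left as a comment.
+
+```python
+# Minimal sketch of a consumer of the Perception output (illustrative only).
+import rclpy
+from rclpy.node import Node
+from autoware_auto_perception_msgs.msg import PredictedObjects
+
+
+class ObjectConsumer(Node):
+    def __init__(self):
+        super().__init__("object_consumer")
+        self.sub = self.create_subscription(
+            PredictedObjects,
+            "/perception/object_recognition/objects",
+            self.on_objects,
+            1,
+        )
+
+    def on_objects(self, msg: PredictedObjects):
+        # Each element carries the class, shape, and predicted paths of one dynamic object.
+        self.get_logger().info(f"received {len(msg.objects)} dynamic objects")
+
+
+def main():
+    rclpy.init()
+    rclpy.spin(ObjectConsumer())
+
+
+if __name__ == "__main__":
+    main()
+```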