Hopsworks
Bug
HWORKS-224 hopsworks python SDK opensearch link broken
HWORKS-853 Model export fails in integration tests
HWORKS-862 Remote clients are not available on non-DAS nodes
HWORKS-864 Istio doesn't use the docker images from the registry
HWORKS-865 NullPointerException when monitoring execution of a deleted job
HWORKS-869 Email validation regex doesn't support capital letters
HWORKS-872 Cannot disable oauth group mapping from cluster definition
HWORKS-878 Grafana uses wrong prometheus name
HWORKS-879 Upload will fail if the same file is uploaded again while the first upload is ongoing.
HWORKS-881 Validate SparkJob and Python job application file exists before launching job
HWORKS-882 Disallow exporting a model in the root path of a dataset
HWORKS-890 Restoring RonDB backup breaks because restore-data does not allow restoring in a table with unique indexes
HWORKS-898 PyTorch installation does not have GPU support
HWORKS-899 Broken image links in gh-pages documentation
HWORKS-901 Git installation support should not prepend git+
HWORKS-907 CloudManager should blacklist all instances from automatic removal
HWORKS-908 NPE in ProjectQouataController for a project that failed to get created
HWORKS-910 Endpoint identification alg missing from kafka get_default_config()
HWORKS-950 git pull fails if the user has multiple surnames
HWORKS-954 api_key for cert-operator is created empty
HWORKS-961 Do not turn on MySQL binlog if global replication is not enabled
HWORKS-974 Resource guard in ndb-chef for extracting RonDB does not work
HWORKS-975 OpenSearchApi in hopsworks-api should not rely on ELASTIC_ENDPOINT
HWORKS-976 Replacement of CREATE USER with CREATE USER IF NOT EXISTS is wrong in RonDB backup restore script
HWORKS-996 Removal of rotated one time JWT signing key should happen only on the Primary node
HWORKS-1001 Feature store sharing should also share data validation dataset
HWORKS-1029 Pin nbconvert in docker images
Task
HWORKS-135 Models backend should store metadata in tables instead of opensearch
HWORKS-302 Enhancement Request: Ability to disable access to Anaconda package repository and hide the functionality at a system defined level
HWORKS-707 Store descriptive statistics in NDB instead of HopsFS
HWORKS-802 Python 3.12 support for clients
HWORKS-808 add database/url to parameter in kafka cookbook
HWORKS-854 hops-system should use the minimal env definition without pydoop
HWORKS-855 Remove Kagent ability to do service key rotation
HWORKS-861 Disable Kafka alerting rules when bring your own kafka is enabled
HWORKS-868 Increase yarn.resourcemanager.rmappsecurity.jwt.validity to 1 hour
HWORKS-871 Hopsworks should be able to handle oauth claims in array format
HWORKS-873 Add flag to delete RonDB backup with the same backup-id
HWORKS-877 Helm chart for certs-operator
HWORKS-880 Model registry tests should account for whether the model is stored on HDFS and also run in PySpark
HWORKS-883 Fix Sonarqube issues in CommandsController
HWORKS-886 OnlineFS should be able to read Kafka configuration from environment variables
HWORKS-887 certs-operator deploy appropriate certificate for Strimzi operator
HWORKS-889 Chef flag to disable unattended upgrades
HWORKS-891 Add K8s GPUs monitoring to Grafana/Jupyter/Job UI
HWORKS-892 Use a single YarnClientWrapper to monitor all jobs
HWORKS-893 Nightly tests should run without installing the requirements.txt
HWORKS-894 Add pytest to python base environment
HWORKS-895 Upgrade xgboost to 2.0.3
HWORKS-897 Upgrade project python environment libraries
HWORKS-900 Upgrade to PyTorch 2.1.2
HWORKS-909 Add hopsworks-api opensearch and kafka workflow tests
HWORKS-912 Upload staging dir does not need to be configurable.
HWORKS-919 Create database on the online feature store on-demand
HWORKS-934 Use more appropriate data types for statistics
HWORKS-944 Pin upper pandas to 2.1.4
HWORKS-946 Global Chef attribute for arbitrary Systemd unit dependencies
HWORKS-949 Add git repo pull to workflow tests
HWORKS-962 Purge binlog files when Global replication is enabled
HWORKS-965 In some OSes DNS resolution does not work from within a Kubernetes Pod
HWORKS-973 Share feature store with project on creation
HWORKS-977 Mount hopsfs by default in jupyter notebooks and python jobs containers
HWORKS-980 Don't set DB storagePolicy for Hive warehouse location
HWORKS-992 Remove code for generating service Renewal JWTs
HWORKS-1050 Run expat as glassfish hopsfs user
Feature Store
Epic
FSTORE-612 Feature Monitoring
Bug
FSTORE-830 Error fetching feature statistics from feature view UI, but statistics exist with a different timestamp
FSTORE-856 Only one OnlineFS instance running
FSTORE-987 Failed to read data when there is a self-join
FSTORE-989 GCS connector-Encryption fields and secrets update issues
FSTORE-992 create_train_validation_test_split fails with unexpected keyword argument 'pit_query_asof'
FSTORE-998 Can't read from a shared feature store
FSTORE-1000 Tags parameter is missing in TrainingDataset class
FSTORE-1034 .select() method should not default to empty list
FSTORE-1035 No error if a user tries to create a feature view without features
FSTORE-1084 Cannot run multiple insert_stream queries on the same project by default
FSTORE-1089 Remove copying of application code from databricks integration
FSTORE-1095 JDBC storage connector missing driver option in the documentation
FSTORE-1099 Nightly test test_append_feature failing on float type mismatch
FSTORE-1100 Improve OnlineFS offset saving
FSTORE-1110 Recreation of Training dataset throwing exception due to NULL reference
FSTORE-1111 OnlineFS monitoring shows wrong clusterj session count
FSTORE-1118 Pandas arrow type dataframes cannot be inserted into feature group
FSTORE-1120 get_feature_vector(s) init fails if feature view contains complex feature
FSTORE-1122 QueryController issue when joining one feature group multiple times
FSTORE-1126 Python MySQL client sometimes fails when user name is too long
FSTORE-1138 Java client is not designed to work with shared feature store
FSTORE-1139 Get feature store requests can throw NullPointerException if the project is not properly initialized
FSTORE-1140 OnlineFS loads wrong kafka property files
FSTORE-1144 Hopsworks should not change the Access Mode of the Databricks cluster
FSTORE-1146 Appending lots of features results in error to commit activity update
FSTORE-1154 OnlineFS onPartitionsAssigned ConcurrentModificationException
FSTORE-1156 Fix FeatureView.clean() code snippet to use the static method
FSTORE-1161 Arrowflight server hangs when instantiating a FlightServer instance and cannot validate the certificates.
FSTORE-1174 Spark streaming workflow test failing
FSTORE-1181 Helper columns should return all columns if they have different names across feature groups
FSTORE-1184 Kafka storage connector confluent_options method uses wrong certificates.
FSTORE-1187 Execution failing with framework_failure causes insert to hang forever
FSTORE-1189 Arrowflight `tls` option parsing the wrong number of arguments
FSTORE-1193 HSFS python tests fail after moto has been updated
FSTORE-1199 Training dataset info objects cannot be retrieved within a job
FSTORE-1203 'FeatureGroup' object has no attribute 'time_travel_format'
FSTORE-1230 fg.select_all().read() on PySpark does not return latest version of a Feature Group
FSTORE-1233 AML tutorial replace append (removed from Pandas) and fix grammar errors
FSTORE-1252 In Neo4j tutorial, fix feature group order in query when creating feature view
FSTORE-1253 Create training test split without materialization fails while computing statistics when the Feature Group contains certain types
FSTORE-1272 Statistics cleaner should not delete statistics during migration
Subtask
FSTORE-1132 Enable filtering in find_neighbors
FSTORE-1133 Enable deleting embedding index
FSTORE-1134 Enable updating embedding
FSTORE-1136 Add documentation for similarity search in hsfs
FSTORE-1164 Support composite primary key in similarity search
FSTORE-1276 Use a larger value of k in find_neighbors when using project index
Task
FSTORE-878 Integrate Docsbot into Docs
FSTORE-885 Catch no data error when fetching dataframe in feature monitoring
FSTORE-905 Add tests for training dataset statistics computation
FSTORE-951 Increase test coverage for feature monitoring
FSTORE-971 Do not compute statistics on in-memory training datasets
FSTORE-1021 Add a warning when using externally managed Kafka
FSTORE-1022 Missing data in featurestore benchmark
FSTORE-1042 Investigate increasing time to make hudi commits
FSTORE-1046 Improve online feature store metrics
FSTORE-1052 Update the docs to include Kafka config variables for throughput
FSTORE-1055 Support pandas 2.1.*
FSTORE-1060 Add load tests that read feature groups and feature views
FSTORE-1063 Hopsworks data preview should use arrow flight to retrieve data if available
FSTORE-1070 Tutorial for external Flink client
FSTORE-1074 Expand documentation on filter logic
FSTORE-1090 Concepts & Guides for helper columns and on-demand features
FSTORE-1097 Add user id to workflow test and unit test
FSTORE-1103 AML Tutorial
FSTORE-1104 Pin numpy < 2
FSTORE-1106 inconsistencies in training dataset documentation
FSTORE-1112 Support Similarity Search in the Feature Store v1 - OnlineFS
FSTORE-1113 Support Similarity Search in the Feature Store v1 - HSFS
FSTORE-1115 HSFS should be able to read data from Databricks Unity Catalog
FSTORE-1116 Parallelize PK lookups for get feature vectors
FSTORE-1119 Support Similarity Search in the Feature Store v1 - migration
FSTORE-1121 Support Similarity Search in the Feature Store v1 - onlinefs monitoring
FSTORE-1124 [OnlineFS] Subject id should be added to blacklist if feature group is not found
FSTORE-1125 Add retry when vector db is empty in load test
FSTORE-1127 Fix log4j vulnerability
FSTORE-1128 Fallback to head node host in opensearch api
FSTORE-1130 Get schema from shared project
FSTORE-1131 DeltaStreamer job fails when feature group has complex feature
FSTORE-1142 Remove markupsafe<2.1.0 pinning
FSTORE-1143 Bump fastavro to 1.8.4 to install using provided wheel in python 3.12 environments
FSTORE-1147 Online feature store notification system
FSTORE-1148 Python to Kafka writing analysis and benchmarking
FSTORE-1152 Add support for listing training datasets metadata from a feature view object
FSTORE-1166 Add backend support for Delta as time travel format
FSTORE-1167 Add Api support for delta time travel format
FSTORE-1168 Update databricks instance configurator to configure cluster for delta
FSTORE-1175 Add FeatureGroup.read_changes(..) and FeatureGroup.as_of(..) workflow tests
FSTORE-1182 Extend spark-no-metastore engine to compute statistics and remove calling use database
FSTORE-1196 Neo4j tutorial
FSTORE-1213 Remove support for configuring databricks instances from hsfs API
FSTORE-1259 Add Change Notification for Feature Groups Example