Fix LogSegment static init after 3.8.0 #230

Merged Sep 11, 2024 (1 commit)

Conversation

ozangunalp
Owner

Resolves #228

@k-wall I am not sure how to reproduce the error case, but the generated bytecode is correct:

    private static final Logger LOGGER = LoggerFactory.getLogger(LogSegment.class);
    private static final Timer LOG_FLUSH_TIMER = null;
    private static final String FUTURE_DIR_SUFFIX = "-future";
    private static final Pattern FUTURE_DIR_PATTERN = Pattern.compile("^(\\S+)-(\\S+)\\.(\\S+)-future");
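
For context, here is a minimal sketch of why an uninitialised static matters here. It assumes (the thread does not confirm this) that the NPE in deleteTypeIfExists came from a static such as FUTURE_DIR_PATTERN being null at run time. Only the regex is taken from the listing above; the class name, the helper names and the topic/partition/id labels are hypothetical.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FutureDirPatternSketch {
    // Same regex as in the listing above.
    public static final Pattern FUTURE_DIR_PATTERN =
            Pattern.compile("^(\\S+)-(\\S+)\\.(\\S+)-future");

    // Hypothetical helper: splits a "-future" directory name into its three
    // captured groups, or returns null when the name does not match.
    public static String[] parse(String dirName) {
        Matcher m = FUTURE_DIR_PATTERN.matcher(dirName);
        if (!m.find()) {
            return null;
        }
        return new String[] {m.group(1), m.group(2), m.group(3)};
    }

    // Hypothetical helper: simulates a static that was never initialised.
    // Calling matcher() on a null Pattern throws a NullPointerException.
    public static boolean throwsNpeWhenPatternIsNull(String dirName) {
        Pattern missing = null; // stands in for an uninitialised static field
        try {
            missing.matcher(dirName);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String name = "test-topic-0.0b5f446557754207a7229e01e17754f5-future";
        // prints [test-topic, 0, 0b5f446557754207a7229e01e17754f5]
        System.out.println(Arrays.toString(parse(name)));
        // prints true
        System.out.println(throwsNpeWhenPatternIsNull(name));
    }
}
```

With the static initialised as in the listing, the match succeeds and no exception is thrown.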

@k-wall
Collaborator

k-wall commented Sep 10, 2024

Thanks, I'll give that a spin. If I understood the problem right (from @showuon), it would lead to a disk full issue.

@robobario
Contributor

robobario commented Sep 10, 2024

The issue can be replicated by creating a topic with multiple segments in a partition (by pushing a few GB of data through a single-partition topic) and then deleting the topic. The deletion gets partway through the first segment and then blows up while trying to handle the transaction file, leaving the rest of the segments in place.

docker run -p 19092:9092 -it --rm -e KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://localhost:19092 quay.io/ogunalp/kafka-native:latest-kafka-3.8.0

and in another terminal:

./bin/kafka-topics.sh --create --topic test-topic --partitions 1 --bootstrap-server localhost:19092
./bin/kafka-producer-perf-test.sh --num-records 3000000 --record-size 1024 --throughput -1 --topic test-topic --producer-props bootstrap.servers=localhost:19092
./bin/kafka-topics.sh --delete --topic test-topic --bootstrap-server localhost:19092

The server logs contain the NPE:

2024-09-10 21:18:09,721 ERROR [kaf.log.LogManager] (kafka-scheduler-0) Exception in kafka-delete-logs thread.: java.lang.NullPointerException
	at org.apache.kafka.storage.internals.log.LogSegment.deleteTypeIfExists(LogSegment.java:818)
	at org.apache.kafka.storage.internals.log.LogSegment.lambda$deleteIfExists$13(LogSegment.java:794)
	at org.apache.kafka.common.utils.Utils.tryAll(Utils.java:1168)
	at org.apache.kafka.storage.internals.log.LogSegment.deleteIfExists(LogSegment.java:790)
	at kafka.log.LocalLog$.$anonfun$deleteSegmentFiles$5(LocalLog.scala:927)
	at kafka.log.LocalLog$.$anonfun$deleteSegmentFiles$5$adapted(LocalLog.scala:926)
	at scala.collection.immutable.List.foreach(List.scala:334)
	at kafka.log.LocalLog$.$anonfun$deleteSegmentFiles$4(LocalLog.scala:926)

The NPE occurs during the deletion of the first segment, and the remaining segment files are left on disk:

docker exec ${CONTAINER_ID} du -h target/log-dir/
44K	target/log-dir/__cluster_metadata-0
1.9G	target/log-dir/test-topic-0.0b5f446557754207a7229e01e17754f5-delete
1.9G	target/log-dir/

@robobario
Contributor

robobario commented Sep 10, 2024

The NPE can be provoked just by creating and deleting a topic (and waiting a while for the scheduled file deletion job to kick in). The steps above show how deleted topic data gets left on disk.

@k-wall
Collaborator

k-wall commented Sep 11, 2024

I'll give the fix a spin now

@ozangunalp
Owner Author

IIRC the log cleaner delay is configured to a lower value than the default. Maybe it would be easy to write an integration test (IT) to check this.
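
One check such a test might make, as a rough sketch only: after deleting a topic and waiting for the scheduled deletion, assert that no directory with the "-delete" suffix is left in the log dir. The class and method names below are hypothetical, and the broker startup, topic creation and waiting are left out.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class LeftoverDeleteDirs {
    // Returns the names of directories directly under logDir whose name
    // ends with "-delete", i.e. topic data that is still awaiting removal.
    public static List<String> find(Path logDir) {
        try (Stream<Path> children = Files.list(logDir)) {
            return children
                    .filter(Files::isDirectory)
                    .map(p -> p.getFileName().toString())
                    .filter(name -> name.endsWith("-delete"))
                    .sorted()
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Defaults to the log dir path seen in this thread; pass another as args[0].
        Path logDir = Paths.get(args.length > 0 ? args[0] : "target/log-dir");
        if (!Files.isDirectory(logDir)) {
            System.out.println("Not a directory: " + logDir);
            return;
        }
        System.out.println("Leftover -delete directories: " + find(logDir));
    }
}
```

An IT could call find(...) on the broker's log dir and expect an empty list once the deletion delay has passed.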

@k-wall
Collaborator

k-wall commented Sep 11, 2024

Looks good to me. The NPE is gone and the topic's directory gets removed by the cleaner.

2024-09-11 08:52:47,212 INFO  [kaf.log.LocalLog$] (kafka-scheduler-0) [LocalLog partition=foo-0, dir=/Users/kwall/src/kafka-native/kafka-server/./target/log-dir] Deleting segment files LogSegment(baseOffset=0, size=0, lastModifiedTime=1726041158474, largestRecordTimestamp=-1)
2024-09-11 08:52:47,213 INFO  [org.apa.kaf.sto.int.log.LogSegment] (kafka-scheduler-0) Deleted log /Users/kwall/src/kafka-native/kafka-server/./target/log-dir/foo-0.d6dec36f0bd2440383be96c0312bfec4-delete/00000000000000000000.log.deleted.
2024-09-11 08:52:47,213 INFO  [org.apa.kaf.sto.int.log.LogSegment] (kafka-scheduler-0) Deleted offset index /Users/kwall/src/kafka-native/kafka-server/./target/log-dir/foo-0.d6dec36f0bd2440383be96c0312bfec4-delete/00000000000000000000.index.deleted.
2024-09-11 08:52:47,213 INFO  [org.apa.kaf.sto.int.log.LogSegment] (kafka-scheduler-0) Deleted time index /Users/kwall/src/kafka-native/kafka-server/./target/log-dir/foo-0.d6dec36f0bd2440383be96c0312bfec4-delete/00000000000000000000.timeindex.deleted.
2024-09-11 08:52:47,213 INFO  [kaf.log.LogManager] (kafka-scheduler-0) Deleted log for partition foo-0 in /Users/kwall/src/kafka-native/kafka-server/./target/log-dir/foo-0.d6dec36f0bd2440383be96c0312bfec4-delete.

kwall@Oslo kafka-server % ls -la target/log-dir/
total 40
drwxr-xr-x  10 kwall  staff  320 11 Sep 08:52 .
drwxr-xr-x  17 kwall  staff  544 11 Sep 08:51 ..
-rw-r--r--   1 kwall  staff    0 11 Sep 08:51 .lock
drwxr-xr-x   8 kwall  staff  256 11 Sep 08:51 __cluster_metadata-0
-rw-r--r--   1 kwall  staff  249 11 Sep 08:51 bootstrap.checkpoint
-rw-r--r--   1 kwall  staff    4 11 Sep 08:52 cleaner-offset-checkpoint
-rw-r--r--   1 kwall  staff    4 11 Sep 08:52 log-start-offset-checkpoint
-rw-r--r--   1 kwall  staff  124 11 Sep 08:51 meta.properties
-rw-r--r--   1 kwall  staff    4 11 Sep 08:52 recovery-point-offset-checkpoint
-rw-r--r--   1 kwall  staff    0 11 Sep 08:51 replication-offset-checkpoint

I'd like to spin a defect fix release (0.11.1) once this lands.

@ozangunalp ozangunalp merged commit a21975f into main Sep 11, 2024
1 check passed
Linked issue: File failed to get deleted