Implement SSE3 support in V86 #906
nice :/
SSE3 SUPPORT IN V86 WE DID IT BOYS!! @copy check it out :D
hmm... but why doesn't sse3 show up on
Good question. It does show up in other OSes (e.g. KolibriOS).
Did you test Windows 8 and/or Android with this?
Don't have time rn, will do it later (I use this account on my phone)
(here i am btw)
Agreed: CPU-Z detects SSE3, but Linux's /proc/cpuinfo does not list pni (Prescott New Instructions, the other name for SSE3).
Also, Alpine Linux 3.18.2 (iso) won't boot for me (it worked properly on 5664eea):
Log
[ 62.679000] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[ 62.679000] (detected by 0, t=6002 jiffies, g=-1195, q=1 ncpus=1)
[ 62.679000] rcu: All QSes seen, last rcu_preempt kthread activity 6002 (-23924--29926), jiffies_till_next_fqs=1, root ->qsmask 0x0
[ 62.679000] rcu: rcu_preempt kthread starved for 6002 jiffies! g-1195 f0x2 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=0
[ 62.679000] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
[ 62.679000] rcu: RCU grace-period kthread stack dump:
[ 62.679000] task:rcu_preempt state:R running task stack:0 pid:15 ppid:2 flags:0x00004000
[ 62.679000] Call Trace:
[ 62.679000] __schedule+0x38d/0x1140
[ 62.679000] schedule+0x56/0xf0
[ 62.679000] schedule_timeout+0x6c/0x110
[ 62.679000] ? __bpf_trace_tick_stop+0x30/0x30
[ 62.679000] rcu_gp_fqs_loop+0xfa/0x450
[ 62.679000] ? rcu_gp_init+0x2a2/0x4d0
[ 62.679000] rcu_gp_kthread+0xc5/0x170
[ 62.679000] kthread+0xd5/0x100
[ 62.679000] ? rcu_gp_init+0x4d0/0x4d0
[ 62.679000] ? kthread_complete_and_exit+0x20/0x20
[ 62.679000] ret_from_fork+0x1c/0x28
[ 62.679000] rcu: Stack dump where RCU GP kthread last ran:
[ 62.679000] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.1.51-0-virt #1-Alpine
[ 62.679000] EIP: delay_tsc+0x6a/0xd0
[ 62.679000] Code: 00 39 d8 75 3b 0f ae e8 0f 31 8b 4d ec 89 c6 89 d7 2b 45 e4 1b 55 e8 39 c8 89 d0 1b 45 f0 73 38 b8 01 00 00 00 e8 66 2e 97 ff <64> a1 34 08 c0 ca 85 c0 75 bc e8 f7 2c 04 00 eb b5 8d 74 26 00 90
[ 62.679000] EAX: 00000000 EBX: 00000000 ECX: 00000000 EDX: 00000000
[ 62.679000] ESI: c36c4a80 EDI: 00000014 EBP: c10cff08 ESP: c10cfeec
[ 62.679000] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00000246
[ 62.679000] CR0: 80050033 CR2: ff7ff000 CR3: 0ac16000 CR4: 000006b0
[ 62.679000] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 62.679000] DR6: 00000000 DR7: 00000000
[ 62.679000] Call Trace:
[ 62.679000]
[ 62.679000] ? show_regs.cold+0x14/0x1a
[ 62.679000] ? dump_cpu_task+0x61/0x70
[ 62.679000] ? rcu_check_gp_kthread_starvation.cold+0x157/0x15d
[ 62.679000] ? rcu_sched_clock_irq+0x9a1/0x9d0
[ 62.679000] ? update_process_times+0x63/0x90
[ 62.679000] ? tick_periodic+0x31/0x90
[ 62.679000] ? tick_handle_periodic+0x1e/0x70
[ 62.679000] ? get_stack_info+0x140/0x140
[ 62.679000] ? timer_interrupt+0xf/0x20
[ 62.679000] ? __handle_irq_event_percpu+0x3b/0x160
[ 62.679000] ? handle_irq_event+0x29/0x70
[ 62.679000] ? handle_fasteoi_nmi+0x100/0x100
[ 62.679000] ? handle_level_irq+0x8f/0x170
[ 62.679000] ? __handle_irq+0x9e/0xc0
[ 62.679000]
[ 62.679000] ? __common_interrupt+0x54/0xf0
[ 62.679000] ? common_interrupt+0x34/0x50
[ 62.679000] ? asm_common_interrupt+0x102/0x140
[ 62.679000] ? delay_tsc+0x6a/0xd0
[ 62.679000] ? delay_tsc+0x6a/0xd0
[ 62.679000] __const_udelay+0x2e/0x40
[ 62.679000] test_nmi_ipi.constprop.0+0x94/0xe1
[ 62.679000] local_ipi+0x36/0x52
[ 62.679000] dotest.constprop.0+0xc/0xb6
[ 62.679000] nmi_selftest+0x7e/0x19e
[ 62.679000] native_smp_cpus_done+0x8a/0x191
[ 62.679000] smp_init+0x9f/0xbc
[ 62.679000] kernel_init_freeable+0x13b/0x291
[ 62.679000] ? rest_init+0xc0/0xc0
[ 62.679000] kernel_init+0x12/0x100
[ 62.679000] ret_from_fork+0x1c/0x28
tested commit: 6123197
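For context on the pni flag discussed above: on real hardware SSE3 is advertised through CPUID leaf 1, ECX bit 0, and that is the bit Linux reports as pni in /proc/cpuinfo. The sketch below only illustrates the bit test; `reports_sse3` and the constant name are hypothetical, and it makes no claim about how v86 fills in its CPUID values or why the flag is missing here.

```rust
/// CPUID.01H:ECX bit 0 is the SSE3 feature flag ("pni" in /proc/cpuinfo).
const CPUID_1_ECX_SSE3: u32 = 1 << 0;

/// Hypothetical helper: true if an ECX value returned by CPUID leaf 1
/// advertises SSE3.
fn reports_sse3(cpuid_1_ecx: u32) -> bool {
    cpuid_1_ecx & CPUID_1_ECX_SSE3 != 0
}
```

A guest that checks this bit and sees it clear will treat SSE3 as absent, even if the instructions themselves are implemented.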
aaaaaaand the ci has failed, looks like some issue with the fpu :/
then theres smth wrong with the fisttp implementation, idk what else lmao
I think this CI thing has an error:
That's a bug in the code,
works now?
@copy sorry for being impatient, but can you merge this pull request, since we have sse3 support?
sorry for being impatient, but can you merge this pull request, since we have sse3 support?
I cleared the cache, because the nasmtests needed to be generated, and it found a couple of bugs.
src/rust/cpu/fpu.rs
Outdated
#[no_mangle]
pub unsafe fn fpu_fisttpm64(addr: i32) {
    return_on_pagefault!(writable_or_pagefault(addr, 8));
    let v = fpu_convert_to_i64(fpu_get_st0());
Use .trunc() instead of fpu_convert_to_i64 (also in the other functions).
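A minimal sketch of the behaviour this comment points at, not v86's actual code: FISTTP always truncates toward zero, ignoring the rounding mode in the FPU control word, whereas FIST/FISTP follow that rounding mode. In the sketch a plain `f64` stands in for ST(0) (an assumption; the real register is 80-bit), and `fisttp_i64` is a hypothetical name.

```rust
/// 2^63 as an f64; the first value too large for an i64.
const TWO_POW_63: f64 = 9223372036854775808.0;

/// Hypothetical stand-in for the value FISTTP m64 would store.
/// Truncates toward zero; NaN and out-of-range inputs yield the
/// "integer indefinite" value 0x8000_0000_0000_0000 (i64::MIN).
fn fisttp_i64(st0: f64) -> i64 {
    let t = st0.trunc();
    if t.is_nan() || t < -TWO_POW_63 || t >= TWO_POW_63 {
        // A plain `as` cast would saturate (or give 0 for NaN),
        // so the indefinite value is returned explicitly.
        i64::MIN
    } else {
        t as i64
    }
}
```

For example, 2.7 becomes 2 and -2.7 becomes -2, where round-to-nearest would give 3 and -3.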
src/rust/cpu/instructions_0f.rs
Outdated
destination.f32[0] + destination.f32[1],
destination.f32[2] + destination.f32[3],
source.f32[0] + source.f32[1],
source.f32[2] + source.f32[3],
This implementation isn't correct. See https://www.felixcloutier.com/x86/movshdup
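As a reference for what the linked page describes, here is a small sketch of MOVSHDUP's lane behaviour: the odd-indexed 32-bit lanes of the source are each duplicated into the lane below them. The function name and the `[u32; 4]` representation are assumptions for illustration (lane 0 is the lowest 32 bits); this is not the v86 implementation.

```rust
/// Hypothetical model of MOVSHDUP on four 32-bit lanes.
/// The instruction only moves bits, so raw u32 lanes are used
/// rather than f32 to keep NaN payloads intact.
fn movshdup(source: [u32; 4]) -> [u32; 4] {
    [source[1], source[1], source[3], source[3]]
}
```

No arithmetic is involved, so an implementation that adds lanes together cannot be right for this opcode.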
src/rust/jit_instructions.rs
Outdated
@@ -5684,7 +5693,26 @@ pub fn instr_F30F10_reg_jit(ctx: &mut JitContext, r1: u32, r2: u32) {
    ctx.builder
        .store_aligned_i32(global_pointers::get_reg_xmm_offset(r2));
}

pub fn instr_F30F12_mem_jit(ctx: &mut JitContext, modrm_byte: ModrmByte, r: u32) {
    instr_660F6E_mem_jit(ctx, modrm_byte, r)
This implementation isn't correct. See https://www.felixcloutier.com/x86/movsldup
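For comparison with the linked page, a sketch of MOVSLDUP's lane behaviour: the even-indexed 32-bit lanes of the 128-bit source are each duplicated into the lane above them. The function name and the `[u32; 4]` representation are assumptions for illustration (lane 0 is the lowest 32 bits); this is not the v86 JIT code.

```rust
/// Hypothetical model of MOVSLDUP on four 32-bit lanes.
/// All 128 bits of the source are read and all four destination
/// lanes are written.
fn movsldup(source: [u32; 4]) -> [u32; 4] {
    [source[0], source[0], source[2], source[2]]
}
```

Delegating to a handler that loads a single 32-bit value would leave lanes 2 and 3 wrong, which is one way the reviewed code could diverge from this behaviour.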
Ok, will work on this when I'm home
(or maybe at school lmfao)
we got it down to 9 errors :)
if this aint working then idk lmao
when will this be merged
Merged, cheers.
Well done, congratulations @spetterman66!
You would have never thought a web dev could implement sse3 in an emulator :D
No description provided.