[solved] kworker is killing my machine again ...

ilu
Posts: 2526
Joined: 09 Oct 2013 12:45

Re: [solved ... and then not !!!] kworker is killing my machine again ...

Postby ilu » 28 Oct 2018 17:49

bpo = backports

amnesix
Posts: 176
Joined: 09 Nov 2013 12:46
Location: Berlin

Postby amnesix » 30 Oct 2018 07:21

ilu wrote: bpo = backports
Oh ... :) I could have guessed that myself ...

But I can't stay on 4.9.17. It won't let Thunderbird launch.

So I'm back on 4.9.08, and I have one or two incidents per day. Bugger.

ilu
Posts: 2526
Joined: 09 Oct 2013 12:45

Postby ilu » 30 Oct 2018 11:38

I'm now using 4.18.0-16.1-liquorix-amd64 from Steven Pusser's Liquorix backport for stretch, and Thunderbird doesn't have any launch problems. Actually, no program has launch problems. Every Liquorix kernel starting with 4.11 has worked on my otherwise SolydX stable system.

amnesix
Posts: 176
Joined: 09 Nov 2013 12:46
Location: Berlin

Postby amnesix » 01 Nov 2018 18:15

I now have the same Liquorix kernel as ilu: 4.18.0-16.1-liquorix-amd64.

Question: should I still keep the «echo 10000 > /proc/sys/vm/dirty_writeback_centisecs» active, or not?

amnesix
Posts: 176
Joined: 09 Nov 2013 12:46
Location: Berlin

Postby amnesix » 01 Nov 2018 18:25

Well ... first «Kworker-attack» after 5 minutes. Not good.

ilu
Posts: 2526
Joined: 09 Oct 2013 12:45

Postby ilu » 01 Nov 2018 18:45

If kernel developers don't know what to do we also won't be able to fix this. If the liquorix kernel doesn't make things worse, stick with it. Have you added the repo? That way if the problem gets fixed you'll get the fix right away.

Have you tried one of the tools mentioned here: https://itsfoss.com/reduce-overheating-laptops-linux/ ? Just don't follow the installation instructions for Ubuntu (no PPA please!). I haven't checked the instructions for Debian; if you run into problems, report back. Try one tool at a time unless they are compatible with each other. At least the tools could help to monitor the problem.

amnesix
Posts: 176
Joined: 09 Nov 2013 12:46
Location: Berlin

Postby amnesix » 01 Nov 2018 18:52

I have installed (right now) thermald (it is accessible through the repositories). I'll post an update. Thank you !

amnesix
Posts: 176
Joined: 09 Nov 2013 12:46
Location: Berlin

Postby amnesix » 05 Nov 2018 15:05

My problem is evolving. Right now, I have a kworker process that hogs my processors, but with a clue: «kworker/0:3 kacpi_notify», instead of the usual «kworker:0». Is it really a clue?

ilu
Posts: 2526
Joined: 09 Oct 2013 12:45

Postby ilu » 05 Nov 2018 23:15

Another approach: Use synaptic to install linux-tools and linux-perf. This will install the version for Kernel 4.9 (they must match), so now reboot and use grub menu to boot into the standard stretch 4.9 kernel and wait for the problem to start. Then:
https://askubuntu.com/questions/33640/kworker-what-is-it-and-why-is-it-hogging-so-much-cpu wrote: Record some 10 seconds of backtraces on all your CPUs: sudo perf record -g -a sleep 10
Analyse your recording: sudo perf report
(Navigate the call graph with ←, →, ↑, ↓ and Enter.)
Post the perf output. I haven't tried this because I was too lazy to boot into the 4.9 kernel.
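The perf.data file written by perf record can run to several megabytes, which is awkward to attach to a forum post. Below is a minimal sketch of one way to produce a short text excerpt instead, assuming the installed perf supports `perf report --stdio` (plain text output rather than the interactive browser). The helper name `perf_top_rows` and the file name `perf-excerpt.txt` are made up for illustration.

```shell
#!/bin/sh
# perf_top_rows (hypothetical helper): read a text-mode perf report on stdin,
# drop perf's '#' header lines and blank lines, and keep the first N rows
# (default 40). With call graphs recorded, rows include call-chain lines.
perf_top_rows() {
    grep -v -e '^#' -e '^[[:space:]]*$' | head -n "${1:-40}"
}

# Usage while the kworker is hogging the CPU (needs root, not run here):
#   sudo perf record -g -a sleep 10
#   sudo perf report --stdio | perf_top_rows 40 > perf-excerpt.txt
```

The resulting text file is small enough to paste into a code block.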

amnesix
Posts: 176
Joined: 09 Nov 2013 12:46
Location: Berlin

Postby amnesix » 07 Nov 2018 22:30

Record some 10 seconds of backtraces on all your CPUs: sudo perf record -g -a sleep 10
Should I do it while my machine is in normal state, or in crisis mode ?

amnesix
Posts: 176
Joined: 09 Nov 2013 12:46
Location: Berlin

Postby amnesix » 07 Nov 2018 23:13

The file is too big to attach, even with 5 seconds (9 and 6 MB).

I can see that a «normal mode» report for 10 seconds is 4.4 MB, whereas in «crisis» mode a 10-second report is 9.6 MB.

Here is at least an excerpt:

Code: Select all

+   68.19%     0.12%  kworker/0:2  [kernel.kallsyms]  [k] __switch_to_asm
+   68.07%     0.12%  kworker/0:2  [kernel.kallsyms]  [k] worker_thread
+   68.07%     0.00%  kworker/0:2  [kernel.kallsyms]  [k] kthread
+   68.07%     0.00%  kworker/0:2  [kernel.kallsyms]  [k] ret_from_fork
+   67.70%     0.22%  kworker/0:2  [kernel.kallsyms]  [k] process_one_work
+   66.03%     0.06%  kworker/0:2  [kernel.kallsyms]  [k] acpi_os_execute_deferred
+   60.75%     0.11%  kworker/0:2  [kernel.kallsyms]  [k] acpi_ev_asynch_execute_gpe_method
+   60.16%     0.07%  kworker/0:2  [kernel.kallsyms]  [k] acpi_ns_evaluate
+   57.16%     0.07%  kworker/0:2  [kernel.kallsyms]  [k] acpi_ps_execute_method
+   55.25%     0.10%  kworker/0:2  [kernel.kallsyms]  [k] acpi_ps_parse_aml
+   50.92%     1.58%  kworker/0:2  [kernel.kallsyms]  [k] acpi_ps_parse_loop
+   27.01%     1.23%  kworker/0:2  [kernel.kallsyms]  [k] acpi_ds_exec_end_op
+   12.85%     0.23%  swapper      [kernel.kallsyms]  [k] cpu_startup_entry
+   11.63%     0.28%  kworker/0:2  [kernel.kallsyms]  [k] acpi_ps_get_next_namepath
+   10.65%     0.74%  kworker/0:2  [kernel.kallsyms]  [k] acpi_ns_lookup
+   10.40%     4.42%  kworker/0:2  [kernel.kallsyms]  [k] kmem_cache_alloc
+    9.24%     0.36%  kworker/0:2  [kernel.kallsyms]  [k] acpi_ns_search_and_enter
+    8.59%     8.35%  kworker/0:2  [kernel.kallsyms]  [k] acpi_ns_search_one_scope
+    8.54%     0.21%  kworker/0:2  [kernel.kallsyms]  [k] down_timeout
+    7.96%     0.23%  kworker/0:2  [kernel.kallsyms]  [k] acpi_os_release_object
+    7.52%     0.12%  swapper      [kernel.kallsyms]  [k] cpuidle_enter_state
+    7.50%     0.55%  kworker/0:2  [kernel.kallsyms]  [k] acpi_ds_create_operand
+    6.98%     6.98%  kworker/0:2  [kernel.kallsyms]  [k] acpi_os_write_port
+    6.71%     0.00%  swapper      [kernel.kallsyms]  [k] start_secondary
+    6.15%     1.08%  kworker/0:2  [kernel.kallsyms]  [k] acpi_ps_create_op
+    6.14%     0.00%  swapper      [kernel.kallsyms]  [k] start_kernel
+    6.14%     0.00%  swapper      [kernel.kallsyms]  [k] early_idt_handler_common
+    6.14%     0.00%  swapper      [kernel.kallsyms]  [k] x86_64_start_kernel
+    6.11%     0.41%  kworker/0:2  [kernel.kallsyms]  [k] acpi_ut_create_generic_state
+    5.79%     0.05%  kworker/0:2  [kernel.kallsyms]  [k] acpi_ex_store
+    5.71%     0.31%  kworker/0:2  [kernel.kallsyms]  [k] acpi_ds_create_operands
+    5.68%     0.12%  kworker/0:2  [kernel.kallsyms]  [k] acpi_ex_opcode_1A_1T_1R
+    5.66%     0.05%  kworker/0:2  [kernel.kallsyms]  [k] acpi_ex_store_object_to_node
+    5.65%     0.20%  kworker/0:2  [kernel.kallsyms]  [k] acpi_ex_field_datum_io
+    5.49%     0.29%  kworker/0:2  [kernel.kallsyms]  [k] acpi_ex_insert_into_field
+    5.42%     0.05%  kworker/0:2  [kernel.kallsyms]  [k] acpi_ev_asynch_enable_gpe
+    5.38%     0.14%  kworker/0:2  [kernel.kallsyms]  [k] acpi_ex_write_data_to_field
+    5.17%     0.29%  kworker/0:2  [kernel.kallsyms]  [k] acpi_ex_access_region
+    5.06%     0.00%  kworker/0:2  [kernel.kallsyms]  [k] acpi_ev_finish_gpe
+    4.99%     0.90%  kworker/0:2  [kernel.kallsyms]  [k] acpi_ut_update_object_reference
+    4.96%     0.46%  kworker/0:2  [kernel.kallsyms]  [k] acpi_ps_complete_op
+    4.83%     0.42%  kworker/0:2  [kernel.kallsyms]  [k] acpi_ev_address_space_dispatch
+    4.66%     0.03%  kworker/0:2  [kernel.kallsyms]  [k] acpi_ds_evaluate_name_path
+    4.52%     0.00%  swapper      [kernel.kallsyms]  [k] ret_from_intr
+    4.52%     0.01%  swapper      [kernel.kallsyms]  [k] do_IRQ
+    4.45%     0.71%  kworker/0:2  [kernel.kallsyms]  [k] acpi_ex_resolve_to_value
+    4.43%     0.07%  kworker/0:2  [kernel.kallsyms]  [k] acpi_hw_write_port
+    4.37%     0.05%  kworker/0:2  [kernel.kallsyms]  [k] acpi_ex_write_with_update_rule
+    4.32%     0.00%  swapper      [kernel.kallsyms]  [k] handle_irq
+    4.31%     0.01%  swapper      [kernel.kallsyms]  [k] handle_fasteoi_irq
+    4.28%     0.00%  swapper      [kernel.kallsyms]  [k] handle_irq_event


ilu
Posts: 2526
Joined: 09 Oct 2013 12:45

Postby ilu » 11 Nov 2018 02:44

In crisis mode. Have a look at the askubuntu posting about perf by tanius in https://askubuntu.com/questions/33640/k ... o-much-cpu. I forgot step 3 after logging the crisis, which is: sudo perf report. It seems to result in a graph. I have never seen such a graph, so I can't tell you what to expect. Maybe you can screenshot it.

amnesix
Posts: 176
Joined: 09 Nov 2013 12:46
Location: Berlin

Postby amnesix » 10 Aug 2019 07:24

The beast is dead ... almost.

Searching the net, I found this post : https://unix.stackexchange.com/question ... ard-drive

Following the poster's advice, I tried, as root, to identify which ACPI interrupt was crippling my machine and found, as described there, one interrupt with a huge count: in my case, gpe01.

Code: Select all

root@lapolivier:/home/olivier# grep . -r /sys/firmware/acpi/interrupts/
/sys/firmware/acpi/interrupts/gpe3D:       0         invalid      unmasked
/sys/firmware/acpi/interrupts/gpe31:       0         invalid      unmasked
/sys/firmware/acpi/interrupts/gpe2D:       0         invalid      unmasked
/sys/firmware/acpi/interrupts/gpe21:       0         invalid      unmasked
/sys/firmware/acpi/interrupts/gpe1D:       0         invalid      unmasked
/sys/firmware/acpi/interrupts/ff_pwr_btn:       0  EN     enabled      unmasked
/sys/firmware/acpi/interrupts/gpe11:       0         invalid      unmasked
/sys/firmware/acpi/interrupts/gpe0D:       0  EN     enabled      unmasked
/sys/firmware/acpi/interrupts/gpe01: 7147814     STS enabled      unmasked
/sys/firmware/acpi/interrupts/gpe3B:       0         invalid      unmasked
/sys/firmware/acpi/interrupts/gpe2B:       0         invalid      unmasked
/sys

and when I tried the antidote, it worked, and my machine behaved again as advertised.

Code: Select all

root@lapolivier:/home/olivier# echo "disable" > /sys/firmware/acpi/interrupts/gpe01
So I now have an antidote.

But what about a vaccine?

How can I have this command run on my machine at startup?

(Yes, I tested: it is always the same ACPI interrupt that starts the fatal kworker process that hogs my CPU.)
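Spotting the runaway interrupt by eye works, but the same check can be scripted. Below is a minimal sketch, assuming the per-GPE files are named gpe followed by two hex digits, as in the listing above, and that the first field of each file is its counter. `noisiest_gpe` is a name made up here, and the optional directory argument exists only so the function can be tried on copies of the files.

```shell
#!/bin/sh
# noisiest_gpe (hypothetical helper): print "<name> <count>" for the gpeXX file
# with the highest counter. Optional $1 overrides the directory to scan.
noisiest_gpe() {
    dir="${1:-/sys/firmware/acpi/interrupts}"
    best_name=""
    best_count=0
    # The glob matches gpe00..gpeFF only, so a totals file such as gpe_all is ignored.
    for f in "$dir"/gpe[0-9A-Fa-f][0-9A-Fa-f]; do
        if [ ! -r "$f" ]; then continue; fi
        # The first whitespace-separated field of each file is the counter.
        read -r count _rest < "$f" || true
        case "$count" in
            ''|*[!0-9]*) continue ;;
        esac
        if [ "$count" -gt "$best_count" ]; then
            best_name="${f##*/}"
            best_count="$count"
        fi
    done
    # Nothing is printed when every counter is 0.
    if [ -n "$best_name" ]; then
        printf '%s %s\n' "$best_name" "$best_count"
    fi
}

# Usage as root while the kworker is hogging the CPU:
#   noisiest_gpe
```

A high counter alone is only a hint. Check that disabling that interrupt actually calms the machine before making the change permanent.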

ilu
Posts: 2526
Joined: 09 Oct 2013 12:45

Postby ilu » 10 Aug 2019 16:21

You can make a bash script with the code and execute that script on startup. Go to Settings - Session and Startup - third tab and add the script.

amnesix
Posts: 176
Joined: 09 Nov 2013 12:46
Location: Berlin

Postby amnesix » 10 Aug 2019 17:06

It has to run as root, or with sudo. Doesn't that complicate things?

amnesix
Posts: 176
Joined: 09 Nov 2013 12:46
Location: Berlin

Postby amnesix » 11 Aug 2019 05:51

Time to close this painful chapter of my computer's life :

1) wait until the kworker process starts to hog the CPU

2) under root :

Code: Select all

# grep . -r /sys/firmware/acpi/interrupts/
Then look through the results for a line that looks like this:

Code: Select all

/sys/firmware/acpi/interrupts/gpe01: 7147814     STS enabled      unmasked
The hint is the huge number (in this case 7147814, where the other lines have «0»). Then note the name of the interrupt. In *my* case, it is gpe01, but it could be any other interrupt.

3) under root :

Code: Select all

# echo "disable" > /sys/firmware/acpi/interrupts/gpe01
At this point, your machine should stop acting crazy, and your CPU's workload should be back to normal

4) then it is time to kill the beast preemptively:
write a shell script

Code: Select all

#!/bin/sh
echo "disable" > /sys/firmware/acpi/interrupts/gpe01
Again: the name gpe01 is specific to my case. Do not skip 2).
I saved my script in my home directory under the evocative name «antikworker.sh»; please feel free to be poetic.
Don't forget to give your script the execute permission:

Code: Select all

chmod +x your_bash_script.sh
Then, as root, add the following line to the /etc/crontab file:

Code: Select all

@reboot	root	/path_to_your_script.sh
Reboot, and check that when you repeat 2), the line for your offending ACPI interrupt looks like this:

Code: Select all

/sys/firmware/acpi/interrupts/gpe01:       0     STS disabled     unmasked
The hint is the word «disabled»
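The boot script from step 4 can be made a little more defensive. This is a sketch under stated assumptions: the state word (enabled, disabled or invalid) appears as its own field in the file, as in the lines shown above, and `gpe_state` and `disable_gpe_if_enabled` are illustrative names, not existing tools. It writes only when the interrupt is reported as enabled, so running it a second time changes nothing, and `gpe_state` doubles as the after-reboot check.

```shell
#!/bin/sh
# gpe_state (hypothetical helper): print the state word found in an ACPI
# interrupt file: enabled, disabled or invalid.
gpe_state() {
    tr -s ' \t' '\n' < "$1" | grep -E -x 'enabled|disabled|invalid' | head -n 1
}

# disable_gpe_if_enabled (hypothetical helper): write "disable" to the file
# only when its current state is "enabled"; otherwise leave it untouched.
disable_gpe_if_enabled() {
    if [ "$(gpe_state "$1")" = "enabled" ]; then
        echo "disable" > "$1"
    fi
}

# Usage as root; gpe01 is only this thread's example, use the name from step 2:
#   disable_gpe_if_enabled /sys/firmware/acpi/interrupts/gpe01
#   gpe_state /sys/firmware/acpi/interrupts/gpe01
```

In the script called from /etc/crontab, the functions would be followed by the `disable_gpe_if_enabled` call shown in the usage comment, with gpe01 replaced by your own interrupt.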

Thank you to all those who helped, and to the poster of https://unix.stackexchange.com/question ... hard-drive

ilu
Posts: 2526
Joined: 09 Oct 2013 12:45

Postby ilu » 11 Aug 2019 11:37

Great writeup and I'm glad you finally got it solved!

