When I do heavy transfers (500+ MB), regardless of the protocol (NFS,
HTTP, scp, etc.), the transfers usually freeze after an unpredictable amount of time. Strangely, however, I can still happily ping the machine and ssh into it -- only the main transfer seems affected. The only remedy seems to be to unload and reload ath_pci. (I haven't seen this problem between any of my non-madwifi machines.)

Currently I'm using madwifi-ng-0.9.4.4133.20100621 and Linux kernel 2.6.36, although I've had this problem for a while.

I'm using the default ath_rate_sample. Perhaps I should try the others? Perhaps I should provide more (rate)stats too :P.

It's probably related to http://madwifi-project.org/ticket/1790

_______________________________________________
Madwifi-users mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/madwifi-users
On Thu, 4 Nov 2010 20:24:37 -0400, Dennis Nezic wrote:
> [ snip ]
> I'm using the default ath_rate_sample. Perhaps I should try the
> others? Perhaps I should provide more (rate)stats too :P.
>
> It's probably related to http://madwifi-project.org/ticket/1790

None of the ath_rate_* algorithms fix the problem. (Minstrel performs far better than the others, but eventually succumbs.)

Here are two athstats, using ath_rate_sample, taken roughly 50 minutes apart, during a large continuous NFS file transfer:

422 tx management frames
169236 tx frames discarded due to queue depth
237 tx failed due to too many retries
121886 long on-chip tx retries
65 tx frames with no ack marked
1023541 tx frames with short preamble
21649 tx frames with an alternate rate
20515 rx failed due to bad CRC
108749 PHY errors
    4505 OFDM timing
    33 OFDM restart
    104010 CCK timing
    201 CCK restart
385 periodic calibrations
rssi of last ack: 26
rssi of last rcv: 29
1 switched default/rx antenna
Antenna profile:
[1] tx 512925 rx 99313
[2] tx 510788 rx 174

[48 minutes later...]

901 tx management frames
263728 tx frames discarded due to queue depth
1053 tx failed due to too many retries
221105 long on-chip tx retries
66 tx frames with no ack marked
1660431 tx frames with short preamble
43636 tx frames with an alternate rate
23781 rx failed due to bad CRC
174894 PHY errors
    7282 OFDM timing
    33 OFDM restart
    167365 CCK timing
    214 CCK restart
480 periodic calibrations
rssi of last ack: 23
rssi of last rcv: 25
1 switched default/rx antenna
Antenna profile:
[1] tx 833189 rx 163656
[2] tx 827077 rx 271
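For anyone comparing the two athstats snapshots above, the counters are cumulative, so the interesting numbers are the differences between them. Here is a minimal Python sketch that parses the "count description" lines and subtracts one snapshot from the other. The function names (parse_athstats, counter_deltas) are made up for this example and are not part of madwifi; only an excerpt of the counters posted above is embedded, and lines that do not start with a number (rssi, antenna profile) are simply skipped.

```python
import re

# Matches athstats counter lines such as "169236 tx frames discarded due to queue depth".
# Leading whitespace is allowed so indented PHY sub-counters are picked up too.
COUNTER_LINE = re.compile(r"^\s*(\d+)\s+(\S.*?)\s*$")


def parse_athstats(text):
    """Return {description: count} for every line that starts with a number."""
    counters = {}
    for line in text.splitlines():
        match = COUNTER_LINE.match(line)
        if match:
            counters[match.group(2)] = int(match.group(1))
    return counters


def counter_deltas(before, after):
    """Return after - before for every counter present in both snapshots."""
    return {name: after[name] - before[name] for name in after if name in before}


# Excerpt of the first snapshot posted in this thread.
FIRST = """
169236 tx frames discarded due to queue depth
237 tx failed due to too many retries
121886 long on-chip tx retries
1023541 tx frames with short preamble
108749 PHY errors
    104010 CCK timing
"""

# Excerpt of the second snapshot, taken later during the same transfer.
SECOND = """
263728 tx frames discarded due to queue depth
1053 tx failed due to too many retries
221105 long on-chip tx retries
1660431 tx frames with short preamble
174894 PHY errors
    167365 CCK timing
"""

DELTAS = counter_deltas(parse_athstats(FIRST), parse_athstats(SECOND))

for name, change in sorted(DELTAS.items(), key=lambda item: -item[1]):
    print(f"{change:>8}  {name}")
```

With the figures above, this shows 94492 more frames discarded due to queue depth and 816 more frames failed after too many retries between the two snapshots. What those deltas mean for the freeze is a separate question; the script only does the subtraction.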
On Fri, Nov 12, 2010 at 6:09 PM, Dennis Nezic <[hidden email]> wrote:
Is there a reason you aren't using ath5k? Especially with kernel 2.6.36, I imagine ath5k is getting a lot more attention than madwifi.

In a deployed setup, I ended up using a madwifi rev somewhere in the 3300s with a proprietary HAL and a handful of patches applied out of OpenWrt. You may want to try an approach like that. Here is a link to OpenWrt's madwifi package:

https://dev.openwrt.org/browser/trunk/package/madwifi

The patches are there, and the Makefile specifies what rev those are meant to be applied against, but if you're running on PCs you'd definitely want to cherry-pick out of that, because more than a few of those patches are written with embedded targets in mind.

Still, for a few months I ran madwifi trunk (something between the 4100 and 4200 revs) on my everyday-use laptop and didn't experience anything like you're describing (doing lots of large file transfers). Have you tried running trunk instead of the 0.9.4 branch?

Brian Prodoehl
On Fri, 12 Nov 2010 18:38:02 -0500, Brian Prodoehl wrote:
> [ snip ]
>
> Is there a reason you aren't using ath5k? Especially with kernel
> 2.6.36, I imagine ath5k is getting a lot more attention than madwifi.
>
> [ snip ]
>
> Still, for a few months I ran madwifi trunk (something between the
> 4100 and 4200 revs) on my everyday-use laptop and didn't experience
> anything like you're describing (doing lots of large file
> transfers). Have you tried running trunk instead of the 0.9.4 branch?

Hmmm ... I wasn't using ath5k before because it didn't support "managed" (AP) mode ... but it seems to now! I only tried the 0.9.4 (and probably earlier) madwifi-ng branches.

It's interesting that you've never experienced this problem. Perhaps it's specific to my AR5001X+ PCI card.
> From: "Dennis Nezic" <[hidden email]>
> To: [hidden email]
> Cc: [hidden email]
> Sent: Friday, November 12, 2010 4:56:02 PM
> Subject: Re: [Madwifi-users] AR5001X+ connections freeze on heavy transfers
>
[ snip - to save space ]
>
> Hmmm ... I wasn't using ath5k before because it didn't support
> "managed" (AP) mode ... but it seems to now! I only tried the 0.9.4
> (and probably earlier) madwifi-ng branches.
>
> It's interesting that you've never experienced this problem. Perhaps
> it's specific to my AR5001X+ pci card.

Has anyone found a solution to this issue? I am having a very similar problem using the same type of card, but with the 2.6.31.12-rt21 kernel. I have one VAP configured in STA mode, and whenever I try to transfer data from the wifi-connected machine, it completely freezes. No Oops is generated and the machine appears to be completely locked up, requiring a power off/on cycle.
The weird thing is, going the other way seems to work a little better and will complete without crashing most of the time. For example:

If I copy from the machine with the wifi card to another machine that is ethernet connected, it freezes the wifi-connected machine.

# scp /home/user/mybigfile.tar.gz user@ethernet_machine:/tmp/

If I start the copy from the ethernet-connected machine and pull the data from the wifi-connected machine, it works most of the time.

# scp user@wifi_machine:/home/user/mybigfile.tar.gz /tmp/

I tried the usual suggestions, turning off bgscan and using 802.11b only, but still no luck. The most recent snapshot does not resolve the issue either.

I did recompile the driver with ATH_CAP_SUPERG_FF=0 in BuildCaps.inc. This stopped the machine from completely freezing, and an Oops is now generated on the console, but I still can't transfer a large file without crashing.

Here is some extra info and the Oops from dmesg. If anyone has any ideas it would be much appreciated.

BTW, I tried the ath5k driver from the 2.6.31.12 kernel and it does not seem to work any better.

Thanks,
Tom

-----

# lspci
02:05.0 Ethernet controller: Atheros Communications Inc. Atheros AR5001X+ Wireless Network Adapter (rev 01)

# uname -a
Linux (none) 2.6.31.12-rt21 #8 PREEMPT RT Wed Aug 25 17:43:08 MDT 2010 i686 unknown

# dmesg
[ 3.647620] wlan: svn r4133 (branch madwifi-0.9.4)
[ 3.670550] ath_hal: module license 'Proprietary' taints kernel.
[ 3.670650] Disabling lock debugging due to kernel taint
[ 3.675033] ath_hal: 0.9.18.0 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413)
[ 3.708676] ath_pci: svn r4133 (branch madwifi-0.9.4)
[ 3.708838] ath_pci 0000:02:05.0: PCI INT A -> GSI 18 (level, low) -> IRQ 18
[ 4.167390] ath_rate_sample: 1.2 (svn r4133 (branch madwifi-0.9.4))
[ 4.168883] wifi0: 11a rates: 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
[ 4.169314] wifi0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
[ 4.169558] wifi0: 11g rates: 1Mbps 2Mbps 5.5Mbps 11Mbps 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
[ 4.170116] wifi0: turboA rates: 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
[ 4.170529] wifi0: turboG rates: 6Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
[ 4.170908] wifi0: H/W encryption support: WEP AES AES_CCM TKIP
[ 4.171164] wifi0: mac 5.9 phy 4.3 radio 3.6
[ 4.171313] wifi0: Use hw queue 1 for WME_AC_BE traffic
[ 4.171397] wifi0: Use hw queue 0 for WME_AC_BK traffic
[ 4.171483] wifi0: Use hw queue 2 for WME_AC_VI traffic
[ 4.171568] wifi0: Use hw queue 3 for WME_AC_VO traffic
[ 4.171652] wifi0: Use hw queue 8 for CAB traffic
[ 4.171735] wifi0: Use hw queue 9 for beacons
[ 4.171820] wifi0: Atheros 5212: mem=0xdfef0000, irq=18

# cat /proc/interrupts
           CPU0
  0:        136   IO-APIC-edge      timer
  1:          8   IO-APIC-edge      i8042
  3:          2   IO-APIC-edge
  4:          2   IO-APIC-edge
  6:          2   IO-APIC-edge
  7:          2   IO-APIC-edge
  8:          0   IO-APIC-edge      rtc0
  9:          0   IO-APIC-fasteoi   acpi
 10:          2   IO-APIC-edge
 12:          3   IO-APIC-edge
 14:          0   IO-APIC-edge      ide0
 15:       1227   IO-APIC-edge      ide1
 17:      14511   IO-APIC-fasteoi   eth0, HDA Intel
 18:     173993   IO-APIC-fasteoi   wifi0
 20:          0   IO-APIC-fasteoi   uhci_hcd:usb2
 21:          0   IO-APIC-fasteoi   uhci_hcd:usb4
 22:          0   IO-APIC-fasteoi   uhci_hcd:usb3
 23:          0   IO-APIC-fasteoi   ehci_hcd:usb1
NMI:          0   Non-maskable interrupts
LOC:    2946323   Local timer interrupts
SPU:          0   Spurious interrupts
CNT:          0   Performance counter interrupts
PND:          0   Performance pending work
TRM:          0   Thermal event interrupts
MCE:          0   Machine check exceptions
MCP:          0   Machine check polls
ERR:          0
MIS:          0

# Oops from dmesg
[ 683.154455] BUG: unable to handle kernel NULL pointer dereference at 00000010
[ 683.154538] IP: [<dc97d079>] ath_tx_start+0x159/0x13b0 [ath_pci]
[ 683.154632] *pde = 00000000
[ 683.154683] Oops: 0002 [#1] PREEMPT
[ 683.154737] last sysfs file: /sys/devices/pci0000:00/0000:00:10.2/usb4/devnum
[ 683.154800] Modules linked in: wlan_wep wlan_ccmp wlan_xauth af_packet usbhid hid usb_storage rtc_cmos snd_pcm_oss snd_mixer_oss snd_hda_intel snd_hda_codec snd_pcm snd_page_alloc snd_timer snd dscudkp e1000 wlan_scan_sta wlan_scan_ap ath_rate_sample ath_pci ath_hal(P) wlan
[ 683.155004]
[ 683.155004] Pid: 5, comm: sirq-net-tx/0 Tainted: P (2.6.31.12-rt21 #8) CX700
[ 683.155004] EIP: 0060:[<dc97d079>] EFLAGS: 00010292 CPU: 0
[ 683.155004] EIP is at ath_tx_start+0x159/0x13b0 [ath_pci]
[ 683.155004] EAX: 19e4f0a0 EBX: c1357560 ECX: 000000a0 EDX: 00000000
[ 683.155004] ESI: 00000000 EDI: 00000004 EBP: dbbea320 ESP: db857d9c
[ 683.155004] DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 preempt:00000000
[ 683.155004] Process sirq-net-tx/0 (pid: 5, ti=db856000 task=db845b20 task.ti=db856000)
[ 683.155004] Stack:
[ 683.155004] 00000608 00000001 00000000 dae8da28 00000000 daee6480 00000000 dc8ce1db
[ 683.155004] <0> 00000000 00000000 00000000 00000000 00000000 db857e3a db845104 00000000
[ 683.155004] <0> c101f64d c103c731 00000000 dae2b000 dbbea000 00e2b000 000005e4 dbb41320
[ 683.155004] Call Trace:
[ 683.155004] [<dc8ce1db>] ? ieee80211_encap+0x23b/0xc60 [wlan]
[ 683.155004] [<c101f64d>] ? enqueue_task_rt+0x2d/0x1a0
[ 683.155004] [<c103c731>] ? sched_clock_cpu+0x51/0x300
[ 683.155004] [<c1093b48>] ? seq_escape+0x38/0x110
[ 683.155004] [<dc982fd8>] ? ath_hardstart+0x1e8/0xc00 [ath_pci]
[ 683.155004] [<c101f64d>] ? enqueue_task_rt+0x2d/0x1a0
[ 683.155004] [<c103c731>] ? sched_clock_cpu+0x51/0x300
[ 683.155004] [<c101fde6>] ? update_curr+0x146/0x1b0
[ 683.155004] [<c1020996>] ? wake_up_process+0x16/0x20
[ 683.155004] [<c104dee9>] ? handle_IRQ_event+0x49/0xa0
[ 683.155004] [<c1014f1f>] ? ack_apic_level+0x3f/0x180
[ 683.155004] [<c1236502>] ? dev_hard_start_xmit+0x212/0x2a0
[ 683.155004] [<c124450d>] ? __qdisc_run+0x1fd/0x260
[ 683.155004] [<c12a2663>] ? __schedule+0x1f3/0x3e0
[ 683.155004] [<c1232fe1>] ? net_tx_action+0xa1/0xd0
[ 683.155004] [<c1027fe0>] ? ksoftirqd+0xf0/0x1f0
[ 683.155004] [<c1027ef0>] ? ksoftirqd+0x0/0x1f0
[ 683.155004] [<c1036adc>] ? kthread+0x7c/0x90
[ 683.155004] [<c1036a60>] ? kthread+0x0/0x90
[ 683.155004] [<c10037a7>] ? kernel_thread_helper+0x7/0x10
[ 683.155004] Code: c1 eb 0c c1 e3 05 83 e2 fc 01 da bb 01 00 00 00 89 5c 24 04 8b 9c 24 9c 00 00 00 89 1c 24 8b 1d c4 78 3c c1 ff 53 08 8b 54 24 48 <89> 42 10 f6 85 2c 0f 00 00 01 0f 85 02 0c 00 00 8b 5c 24 48 8b
[ 683.155004] EIP: [<dc97d079>] ath_tx_start+0x159/0x13b0 [ath_pci] SS:ESP 0068:db857d9c
[ 683.155004] CR2: 0000000000000010
[ 683.158132] ---[ end trace c6e982bd8cd7c18f ]---
[ 684.012034] ------------[ cut here ]------------
[ 684.012130] Kernel BUG at c10217d8 [verbose debug info unavailable]
[ 684.012221] invalid opcode: 0000 [#2] PREEMPT
[ 684.012366] last sysfs file: /sys/devices/pci0000:00/0000:00:10.2/usb4/devnum
[ 684.012458] Modules linked in: wlan_wep wlan_ccmp wlan_xauth af_packet usbhid hid usb_storage rtc_cmos snd_pcm_oss snd_mixer_oss snd_hda_intel snd_hda_codec snd_pcm snd_page_alloc snd_timer snd dscudkp e1000 wlan_scan_sta wlan_scan_ap ath_rate_sample ath_pci ath_hal(P) wlan
[ 684.013001]
[ 684.013001] Pid: 11, comm: sirq-rcu/0 Tainted: P D (2.6.31.12-rt21 #8) CX700
[ 684.013001] EIP: 0060:[<c10217d8>] EFLAGS: 00010246 CPU: 0
[ 684.013001] EIP is at __put_task_struct+0x68/0xb0
[ 684.013001] EAX: 00000000 EBX: db845b20 ECX: 00000003 EDX: db840640
[ 684.013001] ESI: db845e80 EDI: fffffeff EBP: 00000000 ESP: db863f78
[ 684.013001] DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068 preempt:00000000
[ 684.013001] Process sirq-rcu/0 (pid: 11, ti=db862000 task=db849440 task.ti=db862000)
[ 684.013001] Stack:
[ 684.013001] 00000000 c10515b7 00000100 c135aec0 c1027fe0 0000000b 00000001 db863f98
[ 684.013001] <0> 00000031 db84ff50 c135aec0 c1027ef0 00000000 c1036adc 00000000 00000000
[ 684.013001] <0> db863fb8 db863fb8 db863fc0 db863fc0 00000000 00000000 db863fd0 db863fd0
[ 684.013001] Call Trace:
[ 684.013001] [<c10515b7>] ? rcu_process_callbacks+0x47/0x80
[ 684.013001] [<c1027fe0>] ? ksoftirqd+0xf0/0x1f0
[ 684.013001] [<c1027ef0>] ? ksoftirqd+0x0/0x1f0
[ 684.013001] [<c1036adc>] ? kthread+0x7c/0x90
[ 684.013001] [<c1036a60>] ? kthread+0x0/0x90
[ 684.013001] [<c10037a7>] ? kernel_thread_helper+0x7/0x10
[ 684.013001] Code: 8b 93 e0 01 00 00 8b 02 85 c0 7e 18 ff 0a 0f 94 c0 84 c0 74 07 89 d0 e8 f7 b4 01 00 89 d8 5b e9 6f ff ff ff 0f 0b eb fe 8d 76 00 <0f> 0b eb fe 8d 74 26 00 ba a8 00 00 00 b8 4c d8 2f c1 e8 f1 1e
[ 684.013001] EIP: [<c10217d8>] __put_task_struct+0x68/0xb0 SS:ESP 0068:db863f78
[ 684.018599] ---[ end trace c6e982bd8cd7c190 ]---
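As a reading aid for the first Oops above: the number after "Oops:" is the x86 page-fault error code, and its three low bits say whether the page was present, whether the access was a write, and whether it came from user mode. The sketch below decodes the 0002 from the trace and cross-checks the fault address. The helper name decode_page_fault_code is invented for this example, and the reading of the marked bytes "<89> 42 10" as a store of EAX to offset 0x10 from EDX is my own decoding of the opcode, not something stated in the thread.

```python
def decode_page_fault_code(code):
    """Split an x86 page-fault error code into its three low bits."""
    return {
        # Bit 0: 1 = protection violation, 0 = page not present.
        "protection_violation": bool(code & 0x1),
        # Bit 1: 1 = write access, 0 = read access.
        "write_access": bool(code & 0x2),
        # Bit 2: 1 = fault raised in user mode, 0 = kernel mode.
        "user_mode": bool(code & 0x4),
    }


# Values copied from the first Oops in this thread.
OOPS_CODE = int("0002", 16)   # "Oops: 0002"
EDX = 0x00000000              # "EDX: 00000000"
CR2 = 0x00000010              # "CR2: 0000000000000010"

# Assumption: bytes "89 42 10" decode as a 32-bit store to [EDX + 0x10].
DISPLACEMENT = 0x10

FAULT = decode_page_fault_code(OOPS_CODE)

print(FAULT)
print("computed fault address:", hex(EDX + DISPLACEMENT))
print("matches CR2:", EDX + DISPLACEMENT == CR2)
```

Under those assumptions the trace reads as a kernel-mode write to a non-present page at address 0x10, which is consistent with a store through a NULL pointer inside ath_tx_start. Which pointer is NULL cannot be told from the trace alone.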