|
wviewweather.com wview and Weather Topics
|
AirBourn
Joined: 25 Dec 2005 Posts: 9 Location: Jordan, MN
|
Posted: Sun Dec 25, 2005 12:30 pm Post subject: BAD Archive Records |
|
|
Wview 1.9.1, SLAB_USBtoUART driver on MacOS 10.3.9 talking to a USB Weather Envoy, listening to the VP signal off the Davis wireless repeater. It works well for long periods of time, and then I get sporadic periods of BAD Archive Records in the system.log.
Can this be attributed solely to bad/flaky reception of the repeated signal? Or might there be some USB/driver communication trouble? The station is only about a hundred feet away, around a metal building, and the Envoy is 20 feet from the repeater, through a building. Although it's the solar wireless repeater and subject to some power fluctuations, I have a hard time blaming signal strength alone for these bad archive records.
Any similar experiences or ideas?
Bryan
Jordan, MN |
|
|
|
mteel
Joined: 30 Jun 2005 Posts: 435 Location: Collinsville, TX
|
Posted: Mon Dec 26, 2005 10:26 am Post subject: |
|
|
I included the User Manual section "20.4 Wireless Reception" to address this type of problem. Follow the instructions there to look at your rxcheck.png chart to see if your wireless reception is bad...
Make sure you have it enabled in wview.conf.
Mark |
|
|
|
AirBourn
Joined: 25 Dec 2005 Posts: 9 Location: Jordan, MN
|
Posted: Mon Dec 26, 2005 12:44 pm Post subject: |
|
|
Yep, I'm familiar with the rxcheck graph, and first turned to that to begin to address the problem. What I find there is a very distinct pattern of good reception (90%+) followed by a spike of poor reception at regular intervals. Looking at the code, it appears that this occurs when the number of errors or bad packets exceeds some integer limit (32767 or 65535), causing the percentage calculation to return an unusable number.
Overall, the reception appears to be quite good, and the graph looks this way during periods of both good and bad archive records. Also, as strange as it seems, I seem to notice more bad archive records during times of little or no wind. The signal reception and archive records seem to be the best when it's windy. Makes me wonder whether wview marks an archive record bad when it sees a zero, or the VP station sends noisy data when it's calm, or wview isn't expecting a zero or "---", or something along those lines. It just appears to be more than intermittent poor reception, especially considering I'm just a hundred feet from the station... |
|
|
|
mteel
Joined: 30 Jun 2005 Posts: 435 Location: Collinsville, TX
|
Posted: Mon Dec 26, 2005 3:25 pm Post subject: |
|
|
Unless you are running on a 16 bit machine, an integer has a maximum positive limit of 2,147,483,647 - that's 2 billion +. The calculations are done with integers as you know if you examined the code.
Further, see http://home.comcast.net/~mullicahillweather/img/rxcheck.png -
that is a wireless station that does not have these periodic "rollovers"...
Further, another guy using an envoy was having a similar periodic problem with reception.
Your graph is not displaying "unusable values", but periodic reception problems. An unusable value would not appear as 5%...
Bad RF reception is not just caused by wind or rain - many other things can come into play.
wview declares an archive record as bad if the temperature is set to 0x7FFF, which is the invalid value I have observed when my wireless unit's battery was low or had interference for an archive interval.
If you had a VP console you would be able to use it to display the RX stats (and all the errors) since you don't appear to trust the wview results. |
|
|
|
AirBourn
Joined: 25 Dec 2005 Posts: 9 Location: Jordan, MN
|
Posted: Mon Dec 26, 2005 6:31 pm Post subject: |
|
|
Yeah, I'm familiar with RF and the usable percentage values - "Unusable" was a poor choice of words, sorry. Just pointing out that the reception spikes don't appear to be the cause of the bad archive records. The bad archive records don't correlate in time with the low points on the graph trace.
I see the periodic poor reception spikes all the time - even when the archive records are all good. I see the bad archive records frequently when the values for wind are small. Purely an observation.
If you wish, you can watch both my rxcheck graph ( http://67.137.107.234/WX/rxcheck.png ) and a graph of the data out of the Envoy ( http://67.137.107.234/StatusGraph.php ). When I have a bad archive record, I draw a red box with an X through the wind trace, as well as an X across the other temperature values out of the Envoy. At the moment (12/26 18:25 CDT) they're all good - about 10 hours' worth. There doesn't appear to be any correlation between the low reception spikes (they're always there) and bad archive records. |
|
|
|
mteel
Joined: 30 Jun 2005 Posts: 435 Location: Collinsville, TX
|
Posted: Mon Dec 26, 2005 6:43 pm Post subject: |
|
|
Bad archive records are received when there was insufficient data during an archive interval to generate a valid archive record. The correlation is that because your reception is much worse than normally observed at certain points in time, it will occasionally result in a bad archive record. It is not unrelated.
I am not at all surprised that your RX quality graph isn't "90%+" except when you get a bad archive record. That is not a real-life scenario. Somehow you have semi-periodic interference that occasionally rises to the level of causing a bad archive record.
You can add a debug message where the rxcheck calculations are done and print out the raw rxcheck response from the envoy - which will include the error values. I wonder if the envoys are susceptible to interference.
Fundamentally if the outside temp value is 0x7FFF in an archive record, it is a "no value" archive record. Wind is not part of that decision procedure. You might want to ask Davis why you get so many bad archive records with your envoy. |
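The rule described above can be sketched in C roughly as follows. This is an illustration only, not wview's actual code: the struct, its field names, the tenths-of-a-degree unit, and the function name are all assumptions made for the example. Only the 0x7FFF sentinel and the fact that wind plays no part in the decision come from the discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Sentinel reported in place of a real outside temperature
   (value taken from the discussion above). */
#define ARCHIVE_TEMP_INVALID 0x7FFF

/* Hypothetical, simplified archive record: only the fields needed here. */
typedef struct {
    int16_t out_temp_tenths_f; /* outside temperature (unit is an assumption) */
    uint8_t avg_wind_mph;      /* average wind speed; 0 is a valid calm reading */
} ArchiveRecordSketch;

/* Returns true when the record carries no usable outside temperature.
   Wind speed is deliberately ignored by this check. */
bool archive_record_is_bad(const ArchiveRecordSketch *rec)
{
    assert(rec != NULL);
    return rec->out_temp_tenths_f == ARCHIVE_TEMP_INVALID;
}
```

With this check, a record with calm wind and a real temperature is accepted, while any record whose temperature field holds 0x7FFF is rejected regardless of the wind value.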
|
|
|
AirBourn
Joined: 25 Dec 2005 Posts: 9 Location: Jordan, MN
|
Posted: Thu Dec 29, 2005 11:39 am Post subject: |
|
|
Stuffed some debug messages into vpinterface-be.c to look at how the packet receive percentages are calculated. Here's a result chunk that shows one of the poor spikes. There are three calculations shown, five minutes apart (archive interval), at 100%, 3%, and 100%.
Quote: |
12/29/05 09:40:05 --------------------------------
12/29/05 09:40:05 Original String: '13508 185 1 724 28821'
12/29/05 09:40:05 rxGood set to 13508 from '13508'
12/29/05 09:40:05 rxMiss set to 185 from '185'
12/29/05 09:40:05 rxCRC set to 28821 from '28821'
12/29/05 09:40:05 Looking at rxGood...
12/29/05 09:40:05 tempint set to 120
12/29/05 09:40:05 rxGood now set to 120
12/29/05 09:40:05 Looking at rxMiss...
12/29/05 09:40:05 tempint set to 0
12/29/05 09:40:05 rxMiss now set to 0
12/29/05 09:40:05 Looking at rxCRC...
12/29/05 09:40:05 tempint set to 0
12/29/05 09:40:05 rxCRC now set to 0
12/29/05 09:40:05 tempint set to 120+0+0 = 120
12/29/05 09:40:05 Percent calculated (100 * 120) / 120 = 100
12/29/05 09:45:05 --------------------------------
12/29/05 09:45:05 Original String: '13603 185 1 724 31859'
12/29/05 09:45:05 rxGood set to 13603 from '13603'
12/29/05 09:45:05 rxMiss set to 185 from '185'
12/29/05 09:45:05 rxCRC set to 31859 from '31859'
12/29/05 09:45:05 Looking at rxGood...
12/29/05 09:45:05 tempint set to 95
12/29/05 09:45:05 rxGood now set to 95
12/29/05 09:45:05 Looking at rxMiss...
12/29/05 09:45:05 tempint set to 0
12/29/05 09:45:05 rxMiss now set to 0
12/29/05 09:45:05 Looking at rxCRC...
12/29/05 09:45:05 tempint set to 3038
12/29/05 09:45:05 rxCRC now set to 3038
12/29/05 09:45:05 tempint set to 95+0+3038 = 3133
12/29/05 09:45:05 Percent calculated (100 * 95) / 3133 = 3
12/29/05 09:50:05 --------------------------------
12/29/05 09:50:05 Original String: '13723 185 1 724 31859'
12/29/05 09:50:05 rxGood set to 13723 from '13723'
12/29/05 09:50:05 rxMiss set to 185 from '185'
12/29/05 09:50:05 rxCRC set to 31859 from '31859'
12/29/05 09:50:05 Looking at rxGood...
12/29/05 09:50:05 tempint set to 120
12/29/05 09:50:05 rxGood now set to 120
12/29/05 09:50:05 Looking at rxMiss...
12/29/05 09:50:05 tempint set to 0
12/29/05 09:50:05 rxMiss now set to 0
12/29/05 09:50:05 Looking at rxCRC...
12/29/05 09:50:05 tempint set to 0
12/29/05 09:50:05 rxCRC now set to 0
12/29/05 09:50:05 tempint set to 120+0+0 = 120
12/29/05 09:50:05 Percent calculated (100 * 120) / 120 = 100
|
It appears to calculate a poor percentage when there's a large increase in rxCRC. In the 3% case above, the increase in rxCRC is calculated, tempint is set to a large delta, 3038, and then stuffed back into the work structure for next time. The 3038 value is then included in the sum of rxGood, rxMiss, and rxCRC: 3133. The 3133 is then used as the divisor for the percentage, resulting in (100 * 95) / 3133 = 3%. In my case here with the Envoy, it appears that the spikes are caused by periodic large increases in rxCRC. Does that make sense?
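For anyone following along, the arithmetic in the debug output can be reproduced with a small C sketch. It is not the code from vpinterface-be.c; the type and function names are made up for the example, and it only assumes what the log shows: field 1 is used as rxGood, field 2 as rxMiss, field 5 as rxCRC, and the percentage is computed from the change in each counter since the previous reply.

```c
#include <assert.h>
#include <stdio.h>

/* Cumulative counters from one RXCHECK reply (only the fields the log uses). */
typedef struct {
    int good; /* field 1: packets received */
    int miss; /* field 2: packets missed */
    int crc;  /* field 5: CRC errors */
} RxCounters;

/* Parse "good miss f3 f4 crc"; returns 1 on success, 0 on a malformed string.
   Fields 3 and 4 are read and discarded. */
int rx_parse(const char *s, RxCounters *out)
{
    int f3, f4;
    assert(s != NULL && out != NULL);
    return sscanf(s, "%d %d %d %d %d",
                  &out->good, &out->miss, &f3, &f4, &out->crc) == 5;
}

/* Percent of good packets in the interval between two replies.
   Returns -1 when any delta is negative or no packets were counted. */
int rx_percent(const RxCounters *prev, const RxCounters *cur)
{
    int good  = cur->good - prev->good;
    int miss  = cur->miss - prev->miss;
    int crc   = cur->crc  - prev->crc;
    int total = good + miss + crc;

    if (good < 0 || miss < 0 || crc < 0 || total == 0)
        return -1;
    return (100 * good) / total;
}
```

Feeding it the 09:40 and 09:45 strings gives deltas of 95, 0, and 3038, so the result is (100 * 95) / 3133, which integer division truncates to 3.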
Also, I found some information in the docs (VTECHREF.txt section 6) about "Calculating ISS Reception". It describes a method of determining the quality of reception by tracking the number of wind samples contained.
Quote: |
6. Calculating ISS reception
The "Number of Wind Samples" field in the archive record can tell
you the quality of radio communication between the ISS (or
wireless anemometer) and the Vantage Pro console because wind
speed data is sent in almost all data packets. In order to use
this, you need to know how many packets you could have gotten if
you had 100 % reception. This is a function of both the archive
interval and the transmitter ID that is sending wind speed.
The formula for determining the expected maximum number of
packets containing wind speed is:
Here archive_interval_min is the archive interval in minutes and
ID is the transmitter ID number between 1 and 8.
It is possible for the number of wind samples to be larger than
the "expected" maximum value. This is because the maximum value
is a long term average, rounded to an integer. The WeatherLink
program displays 100% in these cases (i.e. not the 105% that the
math would suggest).
|
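As a rough illustration of the capping behavior described in the quote, here is a C sketch. The formula for the expected maximum is not reproduced in the quote above, so this example does not guess at it: the caller passes the expected maximum in. The function name is hypothetical.

```c
#include <assert.h>

/* Reception percentage from the archive record's wind-sample count.
   expected_max is the "expected maximum number of packets" from the quoted
   text; its formula is not reproduced here, so the caller supplies it. */
int wind_sample_reception_percent(int wind_samples, int expected_max)
{
    assert(expected_max > 0 && wind_samples >= 0);
    int pct = (100 * wind_samples) / expected_max;
    return pct > 100 ? 100 : pct; /* cap at 100, as WeatherLink is said to do */
}
```

A sample count slightly above the expected maximum is reported as 100 rather than the 105 the raw math would give.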
I see that you're storing off the number of wind samples in the arcRecord in dbfiles.c, but I don't readily see if you're using that information in determining the validity of the arcRecord or the communications. Might that possibly explain why I see more bad archive records during times of little or no wind? Maybe the Envoy handles them differently? Just brainstorming...
P.S. Even though I've got a custom setup here, and am seeing some strange and undesirable behavior on this datastream, wview is still a waaaay better solution than running the WeatherLink package 24/7 - and many thanks to you for that! |
|
|
|
mteel
Joined: 30 Jun 2005 Posts: 435 Location: Collinsville, TX
|
Posted: Thu Dec 29, 2005 12:20 pm Post subject: |
|
|
It concerns me that your number of CRC errors is so high - I believe 120 in a 5 minute period is the maximum number of possible packets (every 2.5 secs) so how you get 3038 CRC errors in 5 minutes is goofy. This may definitely be an envoy thing. You might want to shoot Davis a question about what is the meaning of the RXCHECK fields for an envoy and if it differs from the VP/VP2. I would also ask why you periodically receive archive records with a temp value of 0x7FFF on the envoy.
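The 120-packet ceiling follows from 300 seconds divided by 2.5 seconds per packet. A sanity check along those lines might look like the C sketch below; the function names are hypothetical, the 2.5-second period is the figure given above, and nothing here is taken from wview itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Upper bound on packets in one archive interval, given the transmit period.
   The period is in milliseconds to keep the arithmetic in integers. */
int max_packets(int interval_sec, int packet_period_ms)
{
    assert(interval_sec > 0 && packet_period_ms > 0);
    return (interval_sec * 1000) / packet_period_ms;
}

/* True when a per-interval counter delta exceeds what the radio could deliver. */
bool delta_is_implausible(int delta, int interval_sec, int packet_period_ms)
{
    return delta > max_packets(interval_sec, packet_period_ms);
}
```

For a 5-minute interval the bound is 120, so a CRC delta of 3038 is flagged while a good-packet delta of 95 is not.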
Yes, I am aware of their method for calculating ISS quality but the formula looks a bit cheesy to me and the protocol provides a direct command to retrieve the info - that is why I used the RXCHECK command method.
Even if the wind is calm that should not affect the number of wind samples your ISS collects. A zero reading for velocity is a valid reading.
Mark |
|
|
|
AirBourn
Joined: 25 Dec 2005 Posts: 9 Location: Jordan, MN
|
Posted: Thu Dec 29, 2005 1:15 pm Post subject: |
|
|
Yep, concerns me too. Doesn't make much sense. I sent a note to Davis support - we'll see if they get back to me. Do you think it's possible that the RXCHECK fields have a slightly different definition on the Envoy? Another tidbit of information there: I do see large negative values in that CRC field, too. That also seems strange. Here's a list of the RXCHECK strings received over 3-4 hours:
Quote: |
12/29/05 08:45:06 RXCHECK String: '12205 168 1 724 28804' 24%
12/29/05 08:50:05 RXCHECK String: '12322 171 1 724 28807' 95%
12/29/05 08:55:05 RXCHECK String: '12441 172 1 724 28808' 98%
12/29/05 09:00:05 RXCHECK String: '12559 174 1 724 28810' 96%
12/29/05 09:05:06 RXCHECK String: '12679 174 1 724 28810' 100%
12/29/05 09:10:05 RXCHECK String: '12795 178 1 724 28814' 93%
12/29/05 09:15:05 RXCHECK String: '12914 179 1 724 28815' 98%
12/29/05 09:20:05 RXCHECK String: '13033 180 1 724 28816' 98%
12/29/05 09:25:06 RXCHECK String: '13148 185 1 724 28821' 92%
12/29/05 09:30:05 RXCHECK String: '13268 185 1 724 28821' 100%
12/29/05 09:35:06 RXCHECK String: '13388 185 1 724 28821' 100%
12/29/05 09:40:05 RXCHECK String: '13508 185 1 724 28821' 100%
12/29/05 09:45:05 RXCHECK String: '13603 185 1 724 31859' 3%
12/29/05 09:50:05 RXCHECK String: '13723 185 1 724 31859' 100%
12/29/05 09:55:05 RXCHECK String: '13843 185 1 724 31859' 100%
12/29/05 10:00:05 RXCHECK String: '13963 185 1 829 31859' 100%
12/29/05 10:05:05 RXCHECK String: '14083 185 1 949 31859' 100%
12/29/05 10:10:05 RXCHECK String: '14197 191 1 967 31865' 90%
12/29/05 10:15:05 RXCHECK String: '14314 194 1 967 31868' 95%
12/29/05 10:20:05 RXCHECK String: '14426 202 1 967 31876' 87%
12/29/05 10:25:05 RXCHECK String: '14542 206 1 967 31880' 93%
12/29/05 10:30:05 RXCHECK String: '14658 210 1 967 31884' 93%
12/29/05 10:35:05 RXCHECK String: '14772 216 1 967 31890' 90%
12/29/05 10:40:05 RXCHECK String: '14891 217 1 967 31891' 98%
12/29/05 10:45:05 RXCHECK String: '14968 218 1 967 -28571' Not Calculated
12/29/05 10:50:05 RXCHECK String: '15064 218 1 967 -25511' 3%
12/29/05 10:55:05 RXCHECK String: '15183 219 1 967 -25510' 98%
12/29/05 11:00:05 RXCHECK String: '15302 220 1 967 -25509' 98%
12/29/05 11:05:05 RXCHECK String: '15422 220 1 967 -25509' 100%
12/29/05 11:10:05 RXCHECK String: '15540 222 1 967 -25507' 96%
12/29/05 11:15:05 RXCHECK String: '15659 223 1 967 -25506' 98%
12/29/05 11:20:05 RXCHECK String: '15779 223 1 967 -25506' 100%
12/29/05 11:25:05 RXCHECK String: '15899 223 1 967 -25506' 100%
12/29/05 11:30:05 RXCHECK String: '16017 225 1 967 -25504' 96%
12/29/05 11:35:04 RXCHECK String: '16137 225 1 967 -25504' 100%
12/29/05 11:40:05 RXCHECK String: '16256 226 1 967 -25503' 98%
12/29/05 11:45:04 RXCHECK String: '16351 227 1 967 -22558' 3%
12/29/05 11:50:04 RXCHECK String: '16470 228 1 967 -22557' 98%
12/29/05 11:55:05 RXCHECK String: '16589 229 1 967 -22556' 98%
12/29/05 12:00:05 RXCHECK String: '16706 232 1 967 -22553' 95%
12/29/05 12:05:04 RXCHECK String: '16755 285 2 967 -20097' 1%
12/29/05 12:10:06 RXCHECK String: '16846 312 3 967 -19616' 15%
12/29/05 12:15:05 RXCHECK String: '16963 315 3 967 -19613' 95%
12/29/05 12:20:04 RXCHECK String: '17083 315 3 967 -19613' 100%
12/29/05 12:25:05 RXCHECK String: '17202 316 3 967 -19612' 98%
|
It's interesting - at the 12/29/05 10:45:05 record, the last field goes hugely negative. This is what I saw that led me to believe that it was some sort of integer limit problem. Why else would this value go from 31891 to -28571? The code assumes this is some sort of reset, and does not perform the percent calculation, but does store off the value -60462 in the work structure. Then, the next record period, the delta CRC value is calculated as 3060 CRC errors, and results in a 3% calculation.
Might be a different field definition on the Envoy. 3060 CRC errors in five minutes doesn't make sense, and a negative value for CRC errors doesn't make sense. |
|
|
|
mteel
Joined: 30 Jun 2005 Posts: 435 Location: Collinsville, TX
|
Posted: Thu Dec 29, 2005 1:27 pm Post subject: |
|
|
Yeah, that's kinda what I'm thinking too... |
|
|
|
AirBourn
Joined: 25 Dec 2005 Posts: 9 Location: Jordan, MN
|
Posted: Mon Jan 02, 2006 11:18 pm Post subject: |
|
|
Well, still no word from Davis. Also, another observation. As strange as it seems, the wind has been calm here most of the day, and I've received nothing but bad archive records for the last 5 hours. Two good ones in that entire period. Nothing else in the receiver setup has changed. I look over at the anemometer, and it's not moving in the slightest. Only the vane wiggles a touch now and then. I'm going to have to look into what the Envoy is sending wview during this condition.... |
|
|
|
AirBourn
Joined: 25 Dec 2005 Posts: 9 Location: Jordan, MN
|
Posted: Wed Jan 04, 2006 11:10 am Post subject: |
|
|
The only thing Davis had to say was to suggest that there's some 900MHz interference in the area. Doesn't sound all that plausible to me. I occasionally see bad archive records for hours and hours, and sometimes overnight. I have 2.4 and 5.8GHz transmitters in the area, but nothing on 900MHz. And I highly doubt my neighbor is sitting on his 900MHz phone for 8 hours overnight. Possible, but not probable. I asked them to further clarify the definition of the RXCHECK fields for the Envoy, but haven't heard back on that one. |
|
|
|
AirBourn
Joined: 25 Dec 2005 Posts: 9 Location: Jordan, MN
|
Posted: Fri Jan 06, 2006 1:47 pm Post subject: |
|
|
Davis had this to say:
Quote: | RXCHECK is the same on the Envoy. The last field is the CRC error count. 7FFF is no data or a bad sensor. Basically no temp info. |
I asked further about negative values and huge increases in CRC. |
|
|
|
AirBourn
Joined: 25 Dec 2005 Posts: 9 Location: Jordan, MN
|
Posted: Thu Jan 12, 2006 7:08 pm Post subject: |
|
|
Brett at Davis had this to say about negative values for the CRC field, and large increases in CRC errors:
Quote: |
The "negative" values come because while the number on the console is an
unsigned 2-byte value, the function that converts a number into a string is
designed for signed numbers.
When you see a negative value, add 65536 to determine the correct value.
The large number of CRCs is likely due to some RF radiator swamping the
console receiver circuit. Cordless phone, cell phones, baby monitors etc.
|
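Brett's correction can be sketched in C as below. This is only an illustration of the "add 65536" rule plus a wrap-tolerant delta; the function names are hypothetical, and it is not necessarily how the fix is implemented in wview.

```c
#include <assert.h>
#include <stdint.h>

/* Recover the unsigned 16-bit counter from a value that was printed as
   signed: negative values have 65536 added, as Davis suggests. */
uint16_t crc_count_from_signed(int printed)
{
    assert(printed >= -32768 && printed <= 65535);
    return (uint16_t)(printed < 0 ? printed + 65536 : printed);
}

/* Increase between two readings of a 16-bit counter, tolerating one wrap
   past 65535 (the cast reduces the difference modulo 65536). */
uint16_t crc_delta(uint16_t prev, uint16_t cur)
{
    return (uint16_t)(cur - prev);
}
```

Applied to the log above, -28571 becomes 36965, so the 10:45 interval saw 36965 - 31891 = 5074 CRC errors, not a reset, and the following interval still comes out at 3060.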
|
|
|
|
mteel
Joined: 30 Jun 2005 Posts: 435 Location: Collinsville, TX
|
Posted: Thu Jan 12, 2006 7:39 pm Post subject: |
|
|
I just added the logic for negative CRC counts - will be in the 3.0.0 release soon.
That still doesn't explain why you are getting so many...
Thanks for the updates. |
|
|
|
|
|
|