This patch does a couple of things. It adds a new DEBUG mode where packet
statistics are printed when channels are closed which can be used to track where
packets might be lost in the transcoding chain.
This patch will also print to the kernel log if the AN983 has detected any
errored received packets. Problems of this type are typically system problems,
like when the card is having trouble DMAing packets.
Internal-Issue-ID: DAHDI-1071
Signed-off-by: Shaun Ruffell <sruffell@digium.com>
Signed-off-by: Russ Meyerriecks <rmeyerriecks@digium.com>
Keeping the transmit descriptor ring shorter reduces the time it takes to send
CSM_ENCAP commands to the transcoding engine when the card is otherwise busy.
Signed-off-by: Shaun Ruffell <sruffell@digium.com>
Signed-off-by: Russ Meyerriecks <rmeyerriecks@digium.com>
It is only modified under the chanlock anyway.
Signed-off-by: Shaun Ruffell <sruffell@digium.com>
Signed-off-by: Russ Meyerriecks <rmeyerriecks@digium.com>
Simplifies the logic when polling is enabled. No need to worry about any system
factors when scheduling the default kernel timer.
Signed-off-by: Shaun Ruffell <sruffell@digium.com>
Signed-off-by: Russ Meyerriecks <rmeyerriecks@digium.com>
Otherwise, if there are many RTP commands queued on the command list, some of
the CSM_ENCAP packets, like ACKS, weren't being sent to the firmware within the
timeout value.
Signed-off-by: Shaun Ruffell <sruffell@digium.com>
Signed-off-by: Russ Meyerriecks <rmeyerriecks@digium.com>
Not only does this make it atomic when moving commands from the
waiting_for_response_list to the command_list if the descriptor is full, it will
also make the entire process of submitting a packet the packet transmission
logic atomic.
Signed-off-by: Shaun Ruffell <sruffell@digium.com>
Signed-off-by: Russ Meyerriecks <rmeyerriecks@digium.com>
This was more a holdover when the AN983 interface was brought over from the
voicebus driver.
Signed-off-by: Shaun Ruffell <sruffell@digium.com>
Signed-off-by: Russ Meyerriecks <rmeyerriecks@digium.com>
When we start the shutdown sequence for a channel, there is no need to submit
any RTP packets that are queued on the command list. Under extreme load with
many backed up RTP packets it was possible to have RTP packets submitted after
the channel shutdown process started.
Signed-off-by: Shaun Ruffell <sruffell@digium.com>
Signed-off-by: Russ Meyerriecks <rmeyerriecks@digium.com>
A small percentage of the total packets sent to the DTE ever wait for
completions. This will save on the need to keep the completion around in all the
packets.
Also, since we can use the presence of the completion as the flag whether we
intend to auto free, we can simplify the flags as well.
Signed-off-by: Shaun Ruffell <sruffell@digium.com>
Signed-off-by: Russ Meyerriecks <rmeyerriecks@digium.com>
While not required by the protocol to the DTE, this does help when debugging the
trace files.
Signed-off-by: Shaun Ruffell <sruffell@digium.com>
Signed-off-by: Russ Meyerriecks <rmeyerriecks@digium.com>
Furthermore, do it as soon as we know we should to prevent the ack from
potentially going out after another CSM_ENCAPS packet on another CPU.
Previously, we would not send ACKS to responses we believed we already responded
to.
Signed-off-by: Shaun Ruffell <sruffell@digium.com>
Signed-off-by: Russ Meyerriecks <rmeyerriecks@digium.com>
Eliminates some cases where there are duplicated packets in the capture if the
hardware descriptor ring was already full.
Signed-off-by: Shaun Ruffell <sruffell@digium.com>
Signed-off-by: Russ Meyerriecks <rmeyerriecks@digium.com>
Makes the channels themselves behave like the supervisor channel. This only
protects the driver in the case the commands were severly backed up, like when
there was high packet loss.
Signed-off-by: Shaun Ruffell <sruffell@digium.com>
Signed-off-by: Russ Meyerriecks <rmeyerriecks@digium.com>
They are sufficiently protected by the list locks. This also cleans up a case
where the tcp was unlocked after already completing it, which was corrupting the
list.
Signed-off-by: Shaun Ruffell <sruffell@digium.com>
Signed-off-by: Russ Meyerriecks <rmeyerriecks@digium.com>
In case we missed an alert, this will allow for rapid shutdown of Asterisk.
Signed-off-by: Shaun Ruffell <sruffell@digium.com>
Signed-off-by: Russ Meyerriecks <rmeyerriecks@digium.com>
Even if it is duplicated or we don't have an outbound message waiting, we should
ack it so that the firmware does not keep trying to send it to the host.
Otherwise the firmware could get into situations where it was constantly
retrying to send packets for which it did not receive our previous ACK and
exhaust memory.
I was only seeing this on platforms were packets were going missing in the
stream, increasing the probability that the driver would miss early responses.
Signed-off-by: Shaun Ruffell <sruffell@digium.com>
Signed-off-by: Russ Meyerriecks <rmeyerriecks@digium.com>
The kernel log will now contain reports if there are bus errors.
This is a troubleshooting aide on systems with bus issues.
Signed-off-by: Shaun Ruffell <sruffell@digium.com>
Signed-off-by: Russ Meyerriecks <rmeyerriecks@digium.com>
Clarifies that the semaphore was being used as a mutex. Mutexes are also more
efficient and allow better debugging checks.
Signed-off-by: Shaun Ruffell <sruffell@digium.com>
Signed-off-by: Russ Meyerriecks <rmeyerriecks@digium.com>
The ioctls for the debug network interface on the tc400b0 has not been used for
a long time. It is now gone.
This will also allow the sempahore set in the ioctl to be changed into a mutex
which provides enhanced debugging checks.
Signed-off-by: Shaun Ruffell <sruffell@digium.com>
Signed-off-by: Russ Meyerriecks <rmeyerriecks@digium.com>
When an unsolicited alert is received, we'll flag the card as halted so that
commands will not be retried. This is because often times the firmware will no
longer process any commands in this state and the driver will hold processes in
the dstate while waiting to retry the commands.
This is a debugging aide in that it simplies unloading the driver if the card /
driver is currently in a failed state.
Signed-off-by: Shaun Ruffell <sruffell@digium.com>
Signed-off-by: Russ Meyerriecks <rmeyerriecks@digium.com>
The read-line-multiple command was already disabled on the voicebus cards, which
use the same interface, in commit 4de462c3e0. This
does the same thing for the transcoder card and also disables the read line
command.
I've seen this change directly correlated to problems with the AN983 receiving
packets from the onboard DSP on some platforms.
Internal-Issue-ID: DAHDI-1071
Signed-off-by: Shaun Ruffell <sruffell@digium.com>
Signed-off-by: Russ Meyerriecks <rmeyerriecks@digium.com>
I've not seen this directly tied to any issue, but it's enabled on the voicebus
cards and so brings the wctc4xxp driver in line.
Signed-off-by: Shaun Ruffell <sruffell@digium.com>
Signed-off-by: Russ Meyerriecks <rmeyerriecks@digium.com>
Fixes regression from bb63d03bba (before
v2.7.0). This failed to set the PCM mask on a CAS span when
DAHDI_AUDIO_NOTIFY was not set.
As the first channel of each xbus would be enabled (for
synchronization), a single call may still have passed.
This patch sets the PCM mask on any CAS channel explicitly.
Signed-off-by: Tzafrir Cohen <tzafrir.cohen@xorcom.com>
* It's currently harmless (just re-run the pre/post XPD registrations)
* But it's cleaner this way (as with xbus_register_dahdi_device())
Signed-off-by: Tzafrir Cohen <tzafrir.cohen@xorcom.com>
XPP devices have implicit support for device registration and
unregistration. Even though it is only used for the legacy (non-hotplug)
configuration case, we still prefer to make it explicit.
This attribute would later allow a simpler implementation of the user
space (xpp-specific) tool dahdi_registration.
Signed-off-by: Tzafrir Cohen <tzafrir.cohen@xorcom.com>
In an Astribank with >= 2 PRI ports, switching from E1 to T1 at run-time
may fail at the DAHDI_CHANCONFIG ioctl on the first channel in a span,
That is, on first run of dahdi_cfg, it fails on second span, on second:
it fails on third span, etc.
The code clears the D-channel information on the DAHDI_CHANCONFIG call
for the first channel in the span.
However The code tested for the global "channo" rather than the per-span
"chanpos" to check for the first channel in the span. This the test
failed.
Signed-off-by: Tzafrir Cohen <tzafrir.cohen@xorcom.com>
/usr/lib/hotplug/firmware is an old location not used since the move
from the old hotplug system. We no longer need to support it. No need to
keep two copies of the firmware files.
Acked-by: Russ Meyerriecks <rmeyerriecks@digium.com>
Signed-off-by: Shaun Ruffell <sruffell@digium.com>
This is to support users who are unable to update to the lastest CentOS 5.x.
There is no change for most users on the latest releases of their distribution.
Signed-off-by: Shaun Ruffell <sruffell@digium.com>
The timer_list member of the transcoder buffer has not been used for a long
time. Saves some space in the buffer in addition to removing any future
curiosity about what it is for.
Signed-off-by: Shaun Ruffell <sruffell@digium.com>
Normally the board drivers should define this, but if they do not we will
provide a suitable default. This allows compilation with vanilla Linux 2.6.18.
Signed-off-by: Shaun Ruffell <sruffell@digium.com>
If the memory allocation for the new channel array fails, it would be possible
to call enable_irq() without the corresponding call to disable_irq().
Signed-off-by: Shaun Ruffell <sruffell@digium.com>
This fixes a regression introduced first in release 2.9.1 with commit
(7ce8498465 "firmware: Refactor by using build_tools/install_firmware.")
which prevents from installing the firmware in a location other than the system
root.
Bug: https://issues.asterisk.org/jira/browse/DAHLIN-337
Reported-by: Anthony Messina <amessina@messinet.com>
Signed-off-by: Tzafrir Cohen <tzafrir.cohen@xorcom.com>
[edited the commit message]
Signed-off-by: Shaun Ruffell <sruffell@digium.com>
* dahdi_registration writes multiple times to:
/sys/bus/astribanks/devices/*/*/span
* In some race cases this resulted in corruption and eventual kernel
panic.
* Until migration to "assigned-spans" is complete:
- Accept and ignore multiple "dahdi registrations" from user-space.
Signed-off-by: Tzafrir Cohen <tzafrir.cohen@xorcom.com>
When the card goes through a reset the PCIe link will be brought down. Some
slots will report this change upstream to the root port which will believe that
the card has been hotplugged out of the system.
This fixes cases on some systems where, during a firmware update, the card gets
removed from the system's logical PCI tree with messages like the following in
the kernel log:
pciehp 0000:00:1c.0:pcie04: Card not present on Slot(259)
pciehp 0000:00:1c.0:pcie04: Card present on Slot(259)
Internal-Issue-ID: DAHDI-1091
Signed-off-by: Shaun Ruffell <sruffell@digium.com>
This firmware image is able to handle system conditions that would result in
spans internmittenly going down and recovering.
Internal-Issue-ID: DAHDI-1087
Signed-off-by: Shaun Ruffell <sruffell@digium.com>
These firmware images can handle system conditions that would previously result
in transmitted audio being intermittently silenced and then unsilenced.
Internal-Issue-ID: DAHDI-1087
Signed-off-by: Shaun Ruffell <sruffell@digium.com>
This firmware image is able to handle system conditions that would result in
spans going down and then coming back intermittently.
Signed-off-by: Shaun Ruffell <sruffell@digium.com>
There are some platforms where the DMA operation to the host does not complete
within the required 50us. When this happens some versions of the firmware will
issue a retry and increment the retry field in the descriptor status word.
Now if this driver is configured in DEBUG mode, the presence of these retries
will be printed to the kernel log as a potential troubleshooting aide.
Internal-Issue-ID: DAHDI-1087
Signed-off-by: Shaun Ruffell <sruffell@digium.com>
There are error conditions that the firmware can detect but cannot recover from
without help from the driver. When firmware detects these conditions the
DESC_IO_ERROR bit will be set in the descriptor header to signal that the driver
should reset the TDM engine on the card. Since this is not due to failure of the
host to service the interrupt in time, it does not make sense to increase
latency when these conditions are detected.
Internal-Issue-ID: DAHDI-1087
Signed-off-by: Shaun Ruffell <sruffell@digium.com>
Without this change it's hard to see what is actually running on the card when
the firmware in the flash doesn't match the reported version. This is a
debugging aide.
Signed-off-by: Shaun Ruffell <sruffell@digium.com>
This allows a .tar.gz file that is normally downloaded from the server to be
checked directly into a test branch. Now when the firmware is being installed,
it can be extracted from any present .tar.gz. Again, this is primarily to
simplify testing.
A nice side effect is that when going from a newer version to an older version,
any stale .bin files in the firmware directory will not be installed into
/lib/firmware, but instead the bin file from the *versioned* .tar.gz will always
be installed. This should reduce problems where you have to run "make
dist-clean" when reverting to an older version that has a firmware change.
Signed-off-by: Shaun Ruffell <sruffell@digium.com>
This is intended to simplify the Makefile and move a large common part into a
subscript. The behavior is the same as it was before.
Signed-off-by: Shaun Ruffell <sruffell@digium.com>
* We didn't handle proper E1/T1 transitions after device registration.
* Fix SPAN_REGISTERED(xpd):
It now checks for DAHDI_FLAGBIT_REGISTERED as well, as this flag is
set/clear by assign/unassign.
* From set_pri_proto():
- Always free/allocate channels
- Always call dahdi_init_span()
* Improve phonedev_cleanup() safety:
- NULL pointers after free.
- Zero number of channels at the end.
* Refactor channel allocation out of phonedev_init():
- Into phonedev_alloc_channels()
- Also called from xpd_init_span() to prevent duplicated logic.
- And called from set_pri_proto() to prevent our bug.
Signed-off-by: Tzafrir Cohen <tzafrir.cohen@xorcom.com>
Commit 74e949c33a exported the module
variable dahdi.auto_assigned_spans. However this is not a good name.
This commit reverts the export and replaces it with a new function to
get the value of the variable.
Signed-off-by: Tzafrir Cohen <tzafrir.cohen@xorcom.com>
Acked-by: Russ Meyerriecks <rmeyerriecks@digium.com>
A fix to 03b3ce1a10: use ktime_get_ts
instead of getnstimeofday to better handle system time changes.
(Shaun Ruffell)
Signed-off-by: Tzafrir Cohen <tzafrir.cohen@xorcom.com>